Lab2 Assembly Lab I
NCKU Electrical Engineering, Computer Architecture and System Laboratory, Since 1999
Outline
1. ISA Introduction
2. RISC-V Introduction
3. Assembly Introduction
ISA Introduction
How does the hardware execute C code?
• The compiler converts C code into machine code according to the instructions that the hardware supports.
• Think: what will happen when different hardware supports different instructions?
[Figure: software → compiler (convert) → ISA → hardware]
What is an ISA?
[Figure: the ISA sits between software and hardware]
• Besides defining the instruction set, the ISA also defines other conventions, such as the byte order.
• Example (little endian), data : 0x1234abcd
    Byte address : 4n     4n+1   4n+2   4n+3
    Byte value   : 0xcd   0xab   0x34   0x12
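The little-endian layout above can be checked with a few lines of Python. This is only a minimal sketch: the helper names `to_little_endian_bytes` and `from_little_endian_bytes` are made up for illustration and are not part of the lab material.

```python
def to_little_endian_bytes(word):
    """Split a 32-bit word into 4 bytes, lowest address first (little endian)."""
    return list((word & 0xFFFFFFFF).to_bytes(4, byteorder="little"))


def from_little_endian_bytes(data):
    """Rebuild the 32-bit word from 4 bytes stored in little-endian order."""
    return int.from_bytes(bytes(data), byteorder="little")


word = 0x1234ABCD
stored = to_little_endian_bytes(word)

# Print the byte placed at each address 4n, 4n+1, 4n+2, 4n+3.
for offset, byte in enumerate(stored):
    print(f"address 4n+{offset}: 0x{byte:02x}")
```

The least significant byte (0xcd) lands at the lowest address, which matches the table on the slide.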
How does a CPU execute a program?
[Figure: a program on storage (SSD / HDD) is loaded by the OS / loader into RAM as a process. Instruction1, Instruction2, Instruction3, … are stored in little-endian byte order at byte addresses inst_base_address + 0x? …]
• File formats : .c → .s → .o → .elf → .bin
• User program : .c
• Machine program : .elf / .bin
Register & Main Memory
[Figure: the CPU (fetch → decode → execute) with registers x0, x1, x2, …, x29, x30, x31 and pc (program counter, +4 per instruction); the RAM holds the stack (from 0xbffffff0), dynamic data / heap, and static data]
• Registers are faster than RAM.
• So a RISC processor needs to load data from memory into registers before computing.
RISC-V Introduction
History of ISA : RISC versus CISC
• Processor instruction sets can be broadly divided into two types: RISC (reduced instruction set computer) and CISC (complex instruction set computer).
• In the beginning, there was no distinction between RISC and CISC. At that time, compiler technology was not mature, and programs were written directly in machine code or assembly language. To reduce programmers' development time, single instructions that perform complex operations were gradually introduced, so that programmers only had to write a few simple instructions.
• Research indicates that only about 20% of the instructions in an instruction set are used frequently, and they account for about 80% of a program; the remaining 80% of the instructions account for only about 20% of the program.
• Supporting more instructions makes the circuit more complex and increases cost and energy consumption. In 1979, Professor David Patterson of the University of California, Berkeley, proposed the idea of RISC: hardware should focus on accelerating the commonly used instructions, while more complex operations should be composed from those commonly used instructions. Instruction sets have been divided into RISC and CISC ever since.
Comparison of RISC & CISC
Item | RISC (Reduced Instruction Set Computer) | CISC (Complex Instruction Set Computer)
Instruction count | Few (reduced) | Many (complex)
CPU microarchitecture | Simple; easy to design; easy to debug; low power | Complex; hard to design; hard to debug; high energy consumption
Operands | Register or immediate. Data in memory needs to be loaded into registers before processing (main memory is slow, registers are fast) | Register, immediate, or memory. Data in memory can be processed directly without loading into registers
Code size | Large (a complex operation = many small instructions) | Small (a complex calculation = a small number of instructions)
Example | MIPS, ARM, RISC-V | x86
RISC-V - Introduction
• RISC-V (pronounced "risk-five") was started at UC Berkeley in May 2010 as part of the Parallel Computing Laboratory (Par Lab), of which Prof. David Patterson was Director.
• The conventional approach to computer architecture is an incremental ISA, where new processors must implement not only new ISA extensions but also all extensions of the past.
• RISC-V is modular. At the core is a base ISA, called RV32I, which will never change. The modularity comes from optional standard extensions that hardware can include or omit depending on the needs of the application.
• The goal of the RISC-V Foundation is to maintain the stability of RISC-V, evolve it slowly and carefully, solely for technical reasons, and try to make it as popular for hardware as Linux is for operating systems.
RISC-V - Feature
• Open
    The RISC-V ISA is provided under open source licenses that do not require fees to use. This makes it easier to get support from a broad range of operating systems, software vendors, and tool developers.
• Stable
    The base and the first standard extensions are already ratified, so there is no need to worry about them changing.
• Simple
    RISC-V has only a small number of instructions.
• Elegant
    • Each instruction is the same length.
    • The instruction formats are fixed.
    • The sign bit of an immediate is always the leftmost bit of the instruction.
RISC-V - ISA
• Format : one base + optional extensions
• Naming convention : the extension letters follow a conventional order

Base | # Registers | Description
RVWMO | - | Weak Memory Ordering
RV32I | 32 | Base Integer Instruction Set, 32-bit (we learn this base ISA this time)
RV64I | 32 | Base Integer Instruction Set, 64-bit
RV32E | 16 | Base Integer Instruction Set (embedded), 32-bit
RV128I | 32 | Base Integer Instruction Set, 128-bit

Extension | Description
M | Integer Multiplication and Division
A | Atomic Instructions
F | Single-Precision Floating-Point
D | Double-Precision Floating-Point
RISC-V - RV32I
• XLEN (integer register width) = 32 bits
    Registers x0 ~ x31 and pc each have width = XLEN.
• IALIGN (instruction-address alignment) = 32 bits
    (With the Compressed extension, IALIGN becomes 16 bits.)
    Since IALIGN = 32, every fetch address satisfies address % 4 == 0, and the next fetch is at pc + 4.
• ILEN (maximum instruction length) = 32 bits
    (Always a multiple of IALIGN.)
• RAM is byte addressed (each address holds bits [7:0]) from 0x00000000 to 2^32 − 1 = 0xffffffff, and instructions are stored in little-endian order:

Byte address | Content
4(n+1)+3 | instruction 2 [31:24]
4(n+1)+2 | instruction 2 [23:16]
4(n+1)+1 | instruction 2 [15:8]
4(n+1)+0 | instruction 2 [7:0]
4n+3 | instruction 1 [31:24]
4n+2 | instruction 1 [23:16]
4n+1 | instruction 1 [15:8]
4n+0 | instruction 1 [7:0]

[Figure: the CPU (fetch → decode → execute) with registers x0 ~ x31 and pc; after each fetch, pc + 4 is fetched next]
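The fetch rule above (byte-addressed RAM, little-endian instruction bytes, fetch address % 4 == 0, then pc + 4) can be sketched in Python as follows. This is an illustrative model only: the `fetch` helper and the 16-byte `ram` are assumptions for the example, and the two instruction words are taken from the disassembly listing later in this lab.

```python
def fetch(ram, pc):
    """Fetch one 32-bit instruction from byte-addressed, little-endian RAM."""
    if pc % 4 != 0:
        # IALIGN = 32 bits, so the fetch address must be a multiple of 4.
        raise ValueError(f"misaligned fetch address: {pc:#x}")
    return int.from_bytes(ram[pc:pc + 4], "little")


# A tiny hypothetical RAM holding two instructions at addresses 0 and 4.
ram = bytearray(16)
ram[0:4] = (0x00100293).to_bytes(4, "little")  # instruction 1
ram[4:8] = (0x00129313).to_bytes(4, "little")  # instruction 2

pc = 0
first = fetch(ram, pc)
pc += 4  # the next instruction is at pc + 4
second = fetch(ram, pc)

print(f"byte at 4n+0 = 0x{ram[0]:02x}, byte at 4n+3 = 0x{ram[3]:02x}")
print(f"first = 0x{first:08x}, second = 0x{second:08x}")
```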
RISC-V - ISA Questions !
Q1 :
Why is the IALIGN of RV32I 32 bits ?
Why does IALIGN need to support 16 bits ?
Why is IALIGN never greater than 32 bits (e.g. 64 / 128 bits) ?
RISC-V - Register
RV32I
• 32 registers, each has 32 bits
• Each register can be written by number or by name. These two lines are the same:
    add x5, x9, x17
    add t0, s1, a7
• You can use these registers at will when you write your own programs. It will work fine.
• But if you need to use others' code or work with others, then you need to follow the RISC-V convention !!!
• x : integer registers
• x0 : always 0, connected to GND
• x1 (ra) : saves the return address
• x2 (sp) : points to the top of the stack
• Additional register : the program counter (pc) points to the last fetched instruction
[Figure: RAM layout from 0x00000000 upward: reserved, text / code (from 0x00010000), static data, heap, and stack (stack bottom at 0xbffffff0, sp at the top of the stack). Call example: the caller calls at 100 with ra = 104, the callee starts at 524 and returns at 796 with pc = ra]
RISC-V - Register
RV32I
• t0 ~ t6 : temporary registers
• Caller saved : these registers are not guaranteed to keep their values after the called function returns, so a caller that still needs them must save them itself.
[Figure: caller-saved flow on the stack in RAM: 1. store the data of the saved registers and move sp, 2. call the function (call at 100, return to 104), 3. retrieve the saved registers and restore sp]
RISC-V - Register Questions !
Q2 :
Why are the temporary registers and the saved registers not numbered sequentially ?
Q3 :
Why does the return value need 2 registers (a0, a1) ?
(Return values : a0 ~ a1, function arguments : a0 ~ a7)
RISC-V - Overview of RV32I Instructions
RISC-V - Instruction Formats
• According to the source and destination registers used, the RISC-V instructions can be divided into 4 formats
    4 types : R / I / S / U
    R : rd / rs1 / rs2 (Register)
    I : rd / rs1 (Immediate)
    S : rs1 / rs2 (Store)
    U : rd (Upper)
    Example (C) :
    int a = 10;
    int b = a + 1;
    int c = a + b;
• Further, according to the bit positions of the immediate, they can be extended to 6 formats (4 + 2 variants)
    6 types : R / I / S / (B) / U / (J)
    S : rs1 / rs2 / imme[11:0]
    (B) (Branch) : rs1 / rs2 / imme[12:1]
    U : rd / imme[31:12]
    (J) (Jump) : rd / imme[20:1]
RISC-V - Instruction Formats
Fields :
• opcode : the operation of the instruction
• func3 : supports opcode to express the operation
• func7 : supports opcode to express the operation
• rs1 : source register 1
• rs2 : source register 2
• rd : destination register
In order to simplify hardware implementation :
• The positions of opcode / func3 / func7 are fixed
• The positions of the registers used (rs1 / rs2 / rd) are fixed
• The sign bit of the immediate is always at inst[31]
• The bits of the immediate overlap as much as possible across formats
[Figure: the 32-bit (RV32I) instruction formats, bit 31 down to bit 0]
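To make the fixed field positions concrete, here is a small Python sketch that packs R-, I-, and U-type instructions. The bit positions and opcode values (0x33, 0x13, 0x37) follow the standard RV32I encoding rather than anything stated on the slide, and the helper names `encode_r`, `encode_i`, and `encode_u` are hypothetical. The results can be compared with the machine words in the disassembly listing later in this lab.

```python
def encode_r(funct7, rs2, rs1, funct3, rd, opcode):
    """R-type: funct7[31:25] rs2[24:20] rs1[19:15] funct3[14:12] rd[11:7] opcode[6:0]."""
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


def encode_i(imm, rs1, funct3, rd, opcode):
    """I-type: imm[11:0] at [31:20], so the sign bit of the immediate is inst[31]."""
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


def encode_u(imm20, rd, opcode):
    """U-type: imm[31:12] at [31:12] rd[11:7] opcode[6:0]."""
    return ((imm20 & 0xFFFFF) << 12) | (rd << 7) | opcode


# Register numbers for the ABI names used below (t0 = x5, t1 = x6, t2 = x7, t5 = x30).
T0, T1, T2, T5 = 5, 6, 7, 30
# Standard RV32I opcodes: register-register, register-immediate, lui.
OP, OP_IMM, LUI = 0x33, 0x13, 0x37

add_word = encode_r(0, T0, T1, 0, T1, OP)       # add  t1, t1, t0
addi_word = encode_i(1, T1, 0, T2, OP_IMM)      # addi t2, t1, 1
lui_word = encode_u(0x1234B, T5, LUI)           # lui  t5, 0x1234b
neg_word = encode_i(-1075, T5, 0, T5, OP_IMM)   # addi t5, t5, -1075

for word in (add_word, addi_word, lui_word, neg_word):
    print(f"{word:08x}")
```

Note that rd, rs1, and funct3 use the same shifts in every helper, which is the "fixed position" property listed above.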
Assembly Introduction
What is Assembly Language?
• Machine code is too difficult to write !!!
• Assembly language
    • A type of low-level programming language
    • Corresponds almost line by line with machine code
    • Communicates directly with a computer's hardware
    • Readable by humans
    • Also often known as "symbolic machine code"
[Figure: software → assembly code → (line by line) → machine code → hardware]
Why do we need to learn assembly code?
Reference : https://www.techopedia.com/why-is-learning-assembly-language-still-important/7/32268
• Despite the prevalence of high-level languages, which are mainly used for the development of applications and software programs, the importance of assembly language in today's world should not be underestimated.
• It communicates with the hardware directly and lets you see how the processor and memory work.
• Assembly language is the gateway to optimization in speed, thereby offering great efficiency and performance.
[Figure: the compiler divides the C code into assembly instructions; the assembler maps each assembly instruction to its corresponding machine code]
So, let us learn how to write assembly language !
Sign extension :
• An instruction has only 32 bits. It cannot fully express an immediate when it also needs to include other information besides the immediate.
• When the CPU decodes the instruction, it sign-extends the immediate to XLEN bits.
• Take the I-type for example :
    imme[11:0] = 1100_0010_0000
    imme = 1111_1111_1111_1111_1111_1100_0010_0000 (XLEN)
    imme[11:0] = 0100_0010_0000
    imme = 0000_0000_0000_0000_0000_0100_0010_0000 (XLEN)
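The sign extension described above can be reproduced with a short Python sketch. The function name `sign_extend` and its default arguments (12-bit immediate, XLEN = 32) are choices made for this example only.

```python
def sign_extend(value, bits=12, xlen=32):
    """Sign-extend the low `bits` bits of `value` to an xlen-bit pattern."""
    value &= (1 << bits) - 1          # keep only imme[bits-1:0]
    if value & (1 << (bits - 1)):     # sign bit set: the value is negative
        value -= 1 << bits
    return value & ((1 << xlen) - 1)  # return the XLEN-bit two's complement pattern


negative = sign_extend(0b1100_0010_0000)
positive = sign_extend(0b0100_0010_0000)

print(f"{negative:032b}")
print(f"{positive:032b}")
```

The upper 20 bits of the result are copies of bit 11 of the immediate, as in the two examples on the slide.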
Instruction classification of RV32I by functionality
• Computation instructions (arithmetic / shift / logical)
    • Register – Register (add, sub, slt, sltu, sll, srl, sra, xor, or, and)
    • Register – Immediate (addi, slti, sltiu, slli, srli, srai, xori, ori, andi)
    • Long Immediate (lui, auipc)
Computation Instruction – (Register - Register)
Computation Instruction – (Register - Immediate)
Computation Instruction – (Long Immediate)
• lui (upper immediate 0xc8763) : t0 = 0xc8763000
• auipc (upper immediate 0xc8763) : if pc = 0x666, t0 = 0xc8763666
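The two long-immediate results above follow from the usual definitions of lui (rd = imm20 << 12) and auipc (rd = pc + (imm20 << 12)). The Python sketch below models just that arithmetic; the function names mirror the mnemonics, and the upper immediate 0xc8763 is inferred from the results shown on the slide.

```python
MASK32 = 0xFFFFFFFF  # registers are XLEN = 32 bits wide


def lui(imm20):
    """Load upper immediate: place imm20 in bits [31:12], low 12 bits are zero."""
    return (imm20 << 12) & MASK32


def auipc(pc, imm20):
    """Add upper immediate to pc: pc + (imm20 << 12), wrapped to 32 bits."""
    return (pc + (imm20 << 12)) & MASK32


t0_lui = lui(0xC8763)
t0_auipc = auipc(0x666, 0xC8763)

print(f"lui   -> 0x{t0_lui:08x}")
print(f"auipc -> 0x{t0_auipc:08x}")
```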
Computation Instruction – Application
• The immediate of addi has only 12 bits and is sign-extended
    Decimal : -2048 ~ 2047
    Hexadecimal : -0x800 ~ 0x7ff
    So 0x234 stays 0x234, but 0xbcd is treated as -0x433.
• Load a large immediate (0xabcd1234) to x5
    lui x5, 0xabcd1
        x5 = 0xabcd1000
    addi x5, x5, 0x234
        0x234 = 0010_0011_0100
        sext (0x234) = 0000_0000_0000_0000_0000_0010_0011_0100 = 0x00000234
        x5 = 0xabcd1000 + 0x00000234 = 0xabcd1234
• Load a large immediate (0x1234abcd) to x5
    Wrong :
    lui x5, 0x1234a
        x5 = 0x1234a000
    addi x5, x5, 0xbcd
        0xbcd = 1011_1100_1101 = -0x433
        sext (-0x433) = 1111_1111_1111_1111_1111_1011_1100_1101 = 0xfffffbcd
        x5 = 0x1234a000 + 0xfffffbcd = 0x12349bcd
    Correct (add 1 to the upper immediate to compensate for the -0x433) :
    lui x5, 0x1234b
        x5 = 0x1234b000
    addi x5, x5, 0xbcd
        -0x433 = 1011_1100_1101
        sext (-0x433) = 1111_1111_1111_1111_1111_1011_1100_1101 = 0xfffffbcd
        x5 = 0x1234b000 + 0xfffffbcd = 0x1234abcd
• Multiplication / Division
    • x5 = x5 × 2
        slli x5, x5, 1
    • x5 = x5 × 3
        slli x6, x5, 1
        add x5, x6, x5
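The "add 1 to the upper part when the low 12 bits look negative" rule can be written as a small Python sketch. The helpers `split_immediate` and `materialize` are hypothetical names for this example: the first picks the lui / addi operands, the second models what the two instructions compute.

```python
MASK32 = 0xFFFFFFFF


def split_immediate(value):
    """Split a 32-bit constant into (hi20, lo12) operands for lui + addi."""
    value &= MASK32
    lo12 = value & 0xFFF
    # Adding 0x800 before shifting rounds the upper part up by 1 whenever
    # bit 11 is set, which compensates for addi sign-extending lo12.
    hi20 = ((value + 0x800) >> 12) & 0xFFFFF
    return hi20, lo12


def materialize(hi20, lo12):
    """Model `lui rd, hi20` followed by `addi rd, rd, lo12` on a 32-bit register."""
    upper = (hi20 << 12) & MASK32
    lower = lo12 - 0x1000 if lo12 & 0x800 else lo12  # sign-extend the 12-bit immediate
    return (upper + lower) & MASK32


for constant in (0xABCD1234, 0x1234ABCD):
    hi20, lo12 = split_immediate(constant)
    print(f"0x{constant:08x}: lui 0x{hi20:05x}, addi 0x{lo12:03x}"
          f" -> 0x{materialize(hi20, lo12):08x}")
```

Calling `materialize(0x1234A, 0xBCD)` reproduces the wrong result 0x12349bcd from the slide, which shows why the upper immediate has to be 0x1234b.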
Computation Instruction – Application
• Logical Operation - AND (keep only the masked bits)
    and x5, x6, x7
    x6        : 0 0 1 1 0 1 0 0 1 1 0 0 1 0 0 1 1 0 0 1 0 1 0 1 0 1 0 1 0 1 1
    x7 (mask) : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0
    x5        : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0
• Logical Operation - OR (append to the original data)
    or x5, x6, x7
    x6        : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0
    x7        : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0
    x5        : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0
• Logical Operation - XOR
    A B Q
    0 0 0
    0 1 1
    1 0 1
    1 1 0
    1. (two operands) The result is "0" if the bits are the same and "1" if they differ
    2. (two operands) XOR with "1" complements a bit, whether it was 0 or 1
    3. (multiple operands) The result is "1" for an odd number of 1s and "0" for an even number
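The three logical operations can be tried out directly in Python, whose `&`, `|`, and `^` operators behave like and / or / xor on the bit level. The 12-bit values below are made up for illustration and are not the bit patterns from the slide.

```python
from functools import reduce
from operator import xor

x6 = 0b1011_0101_1010    # hypothetical register value
mask = 0b0000_1111_0000  # hypothetical mask selecting the middle 4 bits

kept = x6 & mask      # AND: keep only the masked bits
appended = x6 | mask  # OR: set the masked bits, keep the others
toggled = x6 ^ mask   # XOR: complement the masked bits, keep the others


def parity(*bits):
    """XOR of many 1-bit operands: 1 for an odd number of 1s, 0 for an even number."""
    return reduce(xor, bits, 0)


print(f"and: {kept:012b}")
print(f"or : {appended:012b}")
print(f"xor: {toggled:012b}")
print(f"parity(1, 1, 1) = {parity(1, 1, 1)}")
```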
Load & Store Instruction
• Format
    Load : op rd, simm(rs1)
    Store : op rs2, simm(rs1)
• Width : byte = 8 bits, halfword = 16 bits, word = 32 bits
• lb / lh / lw (Load [byte / halfword / word])
    • rd = sign-extension( RAM[rs1 + simm12] [7/15/31 : 0] )
• lbu / lhu (Load [byte / halfword] unsigned)
    • rd = zero-extension( RAM[rs1 + simm12] [7/15 : 0] )
Load & Store Instruction – Example & Questions !
    lw t0, 8(x26)
    lw t1, 20(x26)
    add t2, t0, t1

RAM :
Address | Offset 0 | Offset 1 | Offset 2 | Offset 3
…
4n+20 | 0x78 | 0x56 | 0x34 | 0x12
…
4n+8 | 0x99 | 0x88 | 0xff | 0x00
4n+4 | 0x11 | 0x22 | 0x33 | 0x44
…
0x00000000

Q5 : What is the addressing mode of Load & Store ?
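The example above can be traced with a small Python model of little-endian loads. This is a sketch under two assumptions that the slide does not state: x26 holds the address 4n, and 4n is given the arbitrary value 0x100. The `load` helper is hypothetical; `signed=True` models lb / lh / lw and `signed=False` models lbu / lhu.

```python
MASK32 = 0xFFFFFFFF
BASE = 0x100  # assumed value of 4n, chosen only for this example

# Byte-addressed RAM as a dict; the contents are copied from the table above.
ram = {}
for address, data in (
    (BASE + 4, [0x11, 0x22, 0x33, 0x44]),
    (BASE + 8, [0x99, 0x88, 0xFF, 0x00]),
    (BASE + 20, [0x78, 0x56, 0x34, 0x12]),
):
    for offset, byte in enumerate(data):
        ram[address + offset] = byte


def load(address, size, signed):
    """Read `size` bytes in little-endian order and extend the result to 32 bits."""
    raw = bytes(ram.get(address + i, 0) for i in range(size))
    return int.from_bytes(raw, "little", signed=signed) & MASK32


x26 = BASE                       # assumption: x26 points to address 4n
t0 = load(x26 + 8, 4, True)      # lw t0, 8(x26)
t1 = load(x26 + 20, 4, True)     # lw t1, 20(x26)
t2 = (t0 + t1) & MASK32          # add t2, t0, t1

lb_value = load(x26 + 8, 1, True)    # lb : the byte 0x99 is sign-extended
lbu_value = load(x26 + 8, 1, False)  # lbu: the byte 0x99 is zero-extended

print(f"t0 = 0x{t0:08x}, t1 = 0x{t1:08x}, t2 = 0x{t2:08x}")
print(f"lb = 0x{lb_value:08x}, lbu = 0x{lbu_value:08x}")
```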
Pseudo Instruction
Assembler Directives
• A directive is not a CPU instruction. It has two purposes :
    1. To prompt the assembler to do something
    2. To inform the assembler of certain information
• A directive is not converted into machine code by the assembler
• There are several common uses of directives :
    1. Define constants
    2. Define the memory location for storing data
    3. Include external source code as appropriate
    4. Include other files
Reference : http://godleon.blogspot.com/2008/01/machine-language-cpu-machine-language.html
Format Conversion Flow
• User program (ISA + pseudo instructions) → machine program (only ISA)
    .c … .c (C)
      → Compiler
    .s … .s (Assembly)
      → Assembler
    .o … .o (Object) + lib.o (Library)
      → Linker (Link Script)
    .elf
• Label
    A label can be placed at the beginning of a statement. It represents the current value of the active location counter and can serve as an instruction operand.
• Disassembly of the machine program (.elf) :
    00010054 <main>:
       10054: 00100293  li   t0,1
       10058: 00129313  slli t1,t0,0x1
       1005c: 00530333  add  t1,t1,t0
       10060: 00130393  addi t2,t1,1
       10064: 1234bf37  lui  t5,0x1234b
       10068: bcdf0f13  addi t5,t5,-1075
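As a rough illustration of what the disassembler does with each word in the listing above, the Python sketch below extracts the fixed fields of a 32-bit instruction. The bit positions follow the standard RV32I formats, and the helper names `decode_fields` and `imm_i` are hypothetical; a real disassembler also maps the fields to mnemonics and register names.

```python
def decode_fields(word):
    """Extract the fields that sit at fixed positions in RV32I instructions."""
    return {
        "opcode": word & 0x7F,           # inst[6:0]
        "rd": (word >> 7) & 0x1F,        # inst[11:7]
        "funct3": (word >> 12) & 0x7,    # inst[14:12]
        "rs1": (word >> 15) & 0x1F,      # inst[19:15]
        "rs2": (word >> 20) & 0x1F,      # inst[24:20]
        "funct7": (word >> 25) & 0x7F,   # inst[31:25]
    }


def imm_i(word):
    """I-type immediate: inst[31:20], sign-extended from 12 bits."""
    imm = (word >> 20) & 0xFFF
    return imm - 0x1000 if imm & 0x800 else imm


add_fields = decode_fields(0x00530333)   # add  t1,t1,t0
addi_fields = decode_fields(0xBCDF0F13)  # addi t5,t5,-1075

print(add_fields)
print(addi_fields["rd"], addi_fields["rs1"], imm_i(0xBCDF0F13))
```

For the last word, the immediate field holds 0xbcd, which the decoder reports as -1075 after sign extension.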
Question List
Q1 :
Why is the IALIGN of RV32I 32 bits ?
Why does IALIGN need to support 16 bits ?
Why is IALIGN never greater than 32 bits (e.g. 64 / 128 bits) ?
Q2 : Why are the temporary registers and the saved registers not numbered sequentially ?