02 Assembly
Assembly language
Simple, regular instructions – building blocks of C, Java & other languages
Typically one-to-one mapping to machine language
Our goal
Understand the basics of assembly language
Help figure out what the processor needs to be able to do
Aside: C/C++ Primer
struct coord { int x, y; }; /* Declares a type */
struct coord start; /* Object with two slots, x and y */
start.x = 1; /* For objects “.” accesses a slot */
struct coord *myLoc; /* “*” is a pointer to objects */
myLoc = &start; /* “&” returns thing’s location */
myLoc->y = 2; /* “->” is “*” plus “.” */
int scores[8]; /* 8 ints, from 0..7 */
scores[1]=5; /* Access locations in array */
int *index; /* Declare a pointer */
index = scores; /* Equivalent to index = &scores[0]; points to scores[0] */
index++; /* Next scores location */
(*index)++; /* “*” works in arrays as well */
index = &(scores[3]); /* Points to scores[3] */
*index = 9;
ARM Assembly Language
The basic instructions have four components:
Operator name
Destination
1st operand
2nd operand
More complex: A = B + C + D – E
LDUR X2, B
LDUR X3, C
ADD X1, X2, X3 // assumes B is in X2, C is in X3
ADD X1, X1, X4 // assumes D is in X4
SUB X1, X1, X5 // assumes E is in X5 and A is left in X1
STUR X1, A
Operands & Storage
For speed, CPU has 32 general-purpose registers for storing most operands
For capacity, computer has large memory (multi-GB)
[Figure: a Computer consists of a Processor (Control, plus a Datapath containing the GPRs), Memory, and Devices (Input, Output)]
Registers
32x 64-bit registers for operands
Basic Operations
(Note: just subset of all instructions)
Shift: left & right logical (LSL, LSR)
LSL X0, X1, #4 // X0 = X1<<4
Example: Take bits 6-4 of X0 and make them bits 2-0 of X1, zeros otherwise:
x1 = (x0 >> 4) & 0x7 // in C
LSR X1, X0, #4
ANDI X1, X1, #7
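The shift-then-mask idea can be checked in C. This is only a sketch: the function name `extract_bits_6_4` is made up for illustration. Moving bit 4 down to bit 0 takes a right shift of 4, and the mask 0x7 keeps the three low bits.

```c
#include <stdint.h>

/* Hypothetical helper (name not from the slides): returns bits 6..4 of x0
   in bits 2..0 of the result, with every other bit zero. */
uint64_t extract_bits_6_4(uint64_t x0) {
    uint64_t x1 = x0 >> 4;   /* like LSR X1, X0, #4: bit 4 moves to bit 0 */
    x1 = x1 & 0x7;           /* like ANDI X1, X1, #7: keep only bits 2..0 */
    return x1;
}
```

For instance, an input of 0x50 (binary 0101 0000) has bits 6-4 equal to 101, so the sketch returns 5.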
Memory Organization
Viewed as a large, single-dimension array, with an address.
A memory address is an index into the array
"Byte addressing" means that the index points to a byte of memory.
0 8 bits of data
1 8 bits of data
2 8 bits of data
3 8 bits of data
4 8 bits of data
5 8 bits of data
6 8 bits of data
...
Memory Organization (cont.)
Bytes are nice, but most data items use larger units.
Double-word = 64 bits = 8 bytes
Word = 32 bits = 4 bytes
Registers hold 64 bits of data
0 64 bits of data
8 64 bits of data
16 64 bits of data
24 64 bits of data
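A rough C model of reading one double-word out of byte-addressed memory is shown here. It assumes little-endian byte order (least significant byte at the lowest address), which this slide does not specify. The name `load_doubleword` is hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Sketch: memory is an array of bytes, and an address is an index into it.
   ASSUMPTION: little-endian layout; the slide does not state the byte order. */
uint64_t load_doubleword(const uint8_t *memory, size_t address) {
    uint64_t value = 0;
    /* Walk from the highest-addressed byte down so that the byte at
       'address' ends up in the least significant position. */
    for (int i = 7; i >= 0; i--) {
        value = (value << 8) | memory[address + (size_t)i];
    }
    return value;
}
```

Under that assumption, the value 258 (0x0102) stored at address 8 occupies byte 8 as 0x02 and byte 9 as 0x01, with bytes 10 through 15 zero.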
Addressing Objects: Endian and Alignment
Data Storage
Characters: 8 bits (byte)
Integers: 64 bits (D-word)
Array: Sequence of locations
Pointer: Address (64 bits)
// G = ASCII 71
char a = 'G';
int x = 258;
char *b;
int *y;
b = new char[4];
y = new int[10];
[Figure: byte-addressed memory map, addresses 1000 through 1047]
(Note: real compilers place local variables (the “stack”) at the top of memory, and new’ed structures (the “heap”) starting near, but not at, the beginning. We ignore that here for simplicity)
Loads & Stores
Loads & Stores move data between memory and registers
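All operations work on registers, but the registers are too small to hold all data
[Figure: a Load copies a value from Memory into a General Purpose Register; a Store copies a register value into Memory. Shown: X1 = 130, X2 = 723, X3 = 4, Memory[144] = 66]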
Addressing Example
The address of the start of a character array is stored in X0. Write assembly to load the following characters:
X2 = Array[0]
X3 = Array[1]
X4 = Array[2]
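One way to reason about this exercise is to model it in C. Each character is one byte, so consecutive elements sit at byte offsets 0, 1 and 2 from the address in X0. The sketch below treats memory as a byte array. The function name and the LDURB lines in the comments are one plausible answer, not necessarily the one from lecture.

```c
#include <stdint.h>

/* Hypothetical model: 'memory' is byte-addressed and x0 holds the address
   (index) of the first character of the array. */
void load_three_chars(const uint8_t *memory, uint64_t x0,
                      uint64_t *x2, uint64_t *x3, uint64_t *x4) {
    *x2 = memory[x0 + 0];  /* like LDURB X2, [X0, #0]: Array[0] */
    *x3 = memory[x0 + 1];  /* like LDURB X3, [X0, #1]: Array[1] */
    *x4 = memory[x0 + 2];  /* like LDURB X4, [X0, #2]: Array[2] */
}
```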
Array Example
/* Swap the kth and (k+1)th element of an array */
void swap(int v[], int k) {
int temp = v[k];
v[k] = v[k+1];
v[k+1] = temp;
}
// Assume v in X0, k in X1
[Figure: GPRs X0 = 928, X1 = 10; X2, X3, X4 empty. Loads and Stores move data between the GPRs and Memory]
Memory (address: contents)
1000: 0A12170D34BC2DE1
1008: 1111111111111111
1016: 0000000000000000
1024: 0F0F0F0F0F0F0F0F
1032: FFFFFFFFFFFFFFFF
1040: FFFFFFFFFFFFFFFF
V[0] = mem[X0] = mem[928]
V[1] = mem[X0+8] = mem[936]
V[k] = mem[X0+8*k]
V[k+1] = mem[X0+8*(k+1)] = mem[X0+8*k+8]
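The address arithmetic can be written as a small C function. This is a sketch with a made-up name, `element_address`, and it assumes 8-byte elements as in the slides.

```c
#include <stdint.h>

/* Byte address of v[k] when each element is 8 bytes (one double-word).
   'base' plays the role of X0 and 'k' the role of X1. */
uint64_t element_address(uint64_t base, uint64_t k) {
    return base + (k << 3);   /* shifting left by 3 multiplies k by 8 */
}
```

With base = 928 and k = 10 this gives 928 + 80 = 1008, and v[k+1] sits 8 bytes further on at 1016.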
Execution Cycle Example
PC: Program Counter
IR: Instruction Register
Note: Word addresses; instructions are 32b
Execution cycle: Instruction Fetch -> Instruction Decode -> Operand Fetch -> Execute -> Result Store -> Next Instruction
Instruction Memory (address: contents)
0000: D3600C22
0004: 8B020002
0008: F8400043
0012: F8408044
0016: F8400044
0020: F8408043
General Purpose Registers: X0 = 928, X1 = 10; X2, X3, X4 empty; PC and IR empty
Memory, accessed by Load and Store (address: contents)
1000: 0A12170D34BC2DE1
1008: 1111111111111111
1016: 0000000000000000
1024: 0F0F0F0F0F0F0F0F
1032: FFFFFFFFFFFFFFFF
1040: FFFFFFFFFFFFFFFF
Flags/Condition Codes
Flag register holds information about result of recent math operation
Negative: was result a negative number?
Zero: was result 0?
Overflow: was result magnitude too big to fit into 64-bit register?
Carry: was the carry-out true?
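The four flags can be illustrated for a 64-bit addition. This is a sketch, not the hardware definition: the `Flags` struct and `flags_after_add` are hypothetical names, and overflow is taken to mean signed (two's complement) overflow.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical container for the four condition codes. */
typedef struct {
    bool negative;
    bool zero;
    bool overflow;
    bool carry;
} Flags;

/* Computes the flags a flag-setting 64-bit add of a and b would produce. */
Flags flags_after_add(uint64_t a, uint64_t b) {
    uint64_t result = a + b;              /* unsigned add wraps modulo 2^64 */
    Flags f;
    f.negative = (result >> 63) != 0;     /* top bit is the sign bit */
    f.zero = (result == 0);
    f.carry = (result < a);               /* wrapped around: carry out of bit 63 */
    /* Signed overflow: both inputs have a sign that differs from the result's. */
    f.overflow = (((a ^ result) & (b ^ result)) >> 63) != 0;
    return f;
}
```

Adding 1 to the largest positive signed value sets Negative and Overflow but not Carry. Adding 1 to the all-ones pattern sets Zero and Carry but not Overflow.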
Control Flow
Unconditional Branch – GOTO different next instruction
B START // go to instruction labeled with “START” label
BR X30 // go to address in X30: PC = value of X30
Conditional Branches – GOTO different next instruction if condition is true
1 register: CBZ (==0), CBNZ (!= 0)
CBZ X0, FOO // if X0 == 0 GOTO FOO: PC = Address of instr w/FOO label
if (a == b)
a = a + 3;
else
b = b + 7;
c = a + b;
// X0 = a, X1 = b, X2 = c
CMP X0, X1 // set flags
B.NE ELSEIF // branch if a!=b
ADDI X0, X0, #3 // a = a + 3
B DONE // avoid else
ELSEIF:
ADDI X1, X1, #7 // b = b + 7
DONE:
ADD X2, X0, X1 // c = a + b
Loop Example
Compute the sum of the values 0…N-1
int sum = 0;
for (int I = 0; I != N; I++) {
sum += I;
}
// X0 = N, X1 = sum, X2 = I
Loop Example
Compute the sum of the values 0…N-1
int sum = 0;
for (int I = 0; I < N; I++) {
sum += I;
}
Test at top:
// X0 = N, X1 = sum, X2 = I
ADD X1, X31, X31 // sum = 0
ADD X2, X31, X31 // I = 0
TOP:
CMP X2, X0 // Check I vs N
B.GE END // end when !(I<N)
ADD X1, X1, X2 // sum += I
ADDI X2, X2, #1 // I++
B TOP // next iteration
END:
Test at bottom:
// X0 = N, X1 = sum, X2 = I
ADD X1, X31, X31 // sum = 0
ADD X2, X31, X31 // I = 0
B TEST // Test@bottom
TOP:
ADD X1, X1, X2 // sum += I
ADDI X2, X2, #1 // I++
TEST:
CMP X2, X0 // Check I vs N
B.LT TOP // if (I<N) cont.
END:
Note: Can you do the loop with fewer branches per iteration?
Branch at bottom of loop, branching back. Branch forward at top to this branch.
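The test-at-bottom shape can be mirrored in C with `goto`, so that each statement lines up with one assembly instruction. This is purely illustrative. The function name `sum_below` is made up, and ordinary C code would just use the `for` loop.

```c
#include <stdint.h>

/* Sum of 0..n-1, structured like the test-at-bottom assembly:
   one forward branch before the loop, then one backward branch per iteration. */
int64_t sum_below(int64_t n) {
    int64_t sum = 0;          /* ADD X1, X31, X31 */
    int64_t i = 0;            /* ADD X2, X31, X31 */
    goto test;                /* B TEST */
top:
    sum += i;                 /* ADD X1, X1, X2 */
    i++;                      /* ADDI X2, X2, #1 */
test:
    if (i < n) goto top;      /* CMP X2, X0 then B.LT TOP */
    return sum;               /* END: */
}
```

Each iteration now executes a single branch (the conditional one at the bottom) instead of a conditional branch plus an unconditional `B TOP`.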
String toUpper
Convert a string to all upper case
char *index = string;
while (*index != 0) { /* C strings end in 0 */
if (*index >= 'a' && *index <= 'z')
*index = *index + ('A' - 'a');
index++;
}
// string is a pointer held at Memory[80].
// X0=index, ‘A’ = 65, ‘a’ = 97, ‘z’ = 122
the_while:
ldurb x1, [x0] ; x1 = *index
cbz x1, end_while ; while (*index != 0)
cmpi x1, #97 ; if (*index < ‘a’…)
b.lt is_upper
cmpi x1, #122 ; if (*index > ‘z’….)
b.gt is_upper
subi x1, x1, #32 ; x1 = x1 - ‘a’ + ‘A’
sturb x1, [x0] ; *index = x1
is_upper:
addi x0, x0, #1 ; index++
b the_while
end_while:
String toUpper
Convert a string to all upper case
char *index = string;
while (*index != 0) { /* C strings end in 0 */
if (*index >= 'a' && *index <= 'z')
*index = *index + ('A' - 'a');
index++;
}
// string is a pointer held at Memory[80].
// X0=index, ‘A’ = 65, ‘a’ = 97, ‘z’ = 122
LDUR X0, [X31, #80] // index = string
LOOP:
LDURB X1, [X0, #0] // load byte *index
CBZ X1, END // exit if *index == 0
CMPI X1, #97 // is *index < ‘a’?
B.LT NEXT // don’t change if < ‘a’
CMPI X1, #122 // is *index > ‘z’?
B.GT NEXT // don’t change if > ‘z’
SUBI X1, X1, #32 // X1 = *index + (‘A’ - ‘a’)
STURB X1, [X0, #0] // *index = new value;
NEXT:
ADDI X0, X0, #1 // index++;
B LOOP // continue the loop
END:
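The C fragment from the slide can be wrapped in a function so it compiles and runs as-is. The function name `to_upper` and the in-place calling convention are assumptions for illustration.

```c
/* Converts a NUL-terminated ASCII string to upper case, in place.
   Same logic as the slide's loop; only the function wrapper is added. */
void to_upper(char *string) {
    char *index = string;
    while (*index != 0) {                 /* C strings end in 0 */
        if (*index >= 'a' && *index <= 'z') {
            /* 'A' - 'a' is -32, matching SUBI X1, X1, #32 */
            *index = (char)(*index + ('A' - 'a'));
        }
        index++;
    }
}
```

Characters outside 'a'..'z' (digits, punctuation, letters that are already upper case) are left untouched, which is what the two early branches to NEXT achieve in the assembly.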
Machine Language vs. Assembly Language
Assembly Language
mnemonics for easy reading
labels instead of fixed addresses
Easier for programmers
Almost 1-to-1 with machine language
Machine language
Completely numeric representation
format CPU actually uses
SWAP (each instruction is followed by its machine encoding):
SWAP:
LSL X9, X1, #3
    11010011011 00000 000011 00001 01001
ADD X9, X0, X9 // Compute address of v[k]
    10001011000 01001 000000 00000 01001
LDUR X10, [X9, #0] // get v[k]
    11111000010 000000000 00 01001 01010
LDUR X11, [X9, #8] // get v[k+1]
    11111000010 000001000 00 01001 01011
STUR X11, [X9, #0] // save new value to v[k]
    11111000000 000000000 00 01001 01011
STUR X10, [X9, #8] // save new value to v[k+1]
    11111000000 000001000 00 01001 01010
BR X30 // return from subroutine
    11010110000 00000 000000 00000 11110
Labels
Labels specify the address of the corresponding instruction
Programmer doesn’t have to count line numbers
Insertion of instructions doesn’t require changing entire code
// X0 = N, X1 = sum, X2 = I
ADD X1, X31, X31 // sum = 0
ADD X2, X31, X31 // I = 0
TOP:
CMP X2, X0 // Check I vs N
B.GE END // end when !(I<N)
ADD X1, X1, X2 // sum += I
ADDI X2, X2, #1 // I++
B TOP // next iteration
END:
Notes:
Branches are PC-relative
PC = PC + 4*(BranchOffset)
BranchOffset positive -> branch downward. Negative -> branch upward.
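The offset an assembler would put into a branch can be sketched in C using the rule PC = PC + 4*(BranchOffset). The function names are hypothetical, and the example addresses assume the first ADD of the loop sits at address 0 (the slide does not give addresses).

```c
#include <stdint.h>

/* Offset (in instructions) stored in a branch located at branch_pc
   that should jump to target_pc. Both are byte addresses, multiples of 4. */
int64_t branch_offset(uint64_t branch_pc, uint64_t target_pc) {
    return ((int64_t)target_pc - (int64_t)branch_pc) / 4;
}

/* Inverse: where a branch at branch_pc with the given offset lands. */
uint64_t branch_target(uint64_t branch_pc, int64_t offset) {
    return branch_pc + (uint64_t)(4 * offset);
}
```

Under that assumption TOP is at address 8, `B TOP` is at 24 and END is at 28. `B TOP` therefore gets offset (8 - 24) / 4 = -4 (upward), and `B.GE END` at address 12 gets (28 - 12) / 4 = 4 (downward).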
Labels Example
Compute the value of the labels in the code below.
Branches: PC = PC + 4*(BranchOffset)
Instruction Types
Can group instructions by # of operands
3-register
ADD X0, X1, X2
AND X0, X1, X2
2-register
ADDI X0, X1, #100
ANDI X0, X1, #7
LSL X0, X1, #4
LSR X0, X1, #2
LDUR X0, [X1, #14]
LDURB X0, [X1, #14]
STUR X0, [X1, #14]
STURB X0, [X1, #14]
1-register
BR X30
CBZ X0, FOO
0-register
B START
B.EQ DEST
Instruction Types
Can group instructions by # of operands; each group maps onto instruction formats
3-register
ADD, AND: R-Type (no const/tiny const)
2-register
LSL, LSR: R-Type (no const/tiny const)
ADDI, ANDI: I-Type (medium const)
LDUR, LDURB, STUR, STURB: D-Type (essentially same as I-Type)
1-register
BR: R-Type
CBZ: CB-Type (big const)
0-register
B.EQ: CB-Type (big const)
B: B-Type
Instruction Formats
All instructions encoded in 32 bits (operation + operands/immediates)
Branch (B-Type) Instr[31:21] = 0A0-0BF
[31:26] Opcode | [25:0] BrAddr26
Conditional Branch (CB-Type) Instr[31:21] = 2A0-2A7, 5A0-5AF
[31:24] Opcode | [23:5] CondAddr19 | [4:0] Rd
Register (R-Type) Instr[31:21] = 450-458, 4D6-558, 650-658, 69A-758
[31:21] Opcode | [20:16] Rm | [15:10] SHAMT | [9:5] Rn | [4:0] Rd
Immediate (I-Type) Instr[31:21] = 488-491, 588-591, 688-691, 788-791
[31:22] Opcode | [21:10] ALU_Imm12 | [9:5] Rn | [4:0] Rd
Memory (D-Type) Instr[31:21] = 1C0-1C2, 7C0-7C2
[31:21] Opcode | [20:12] DT_Address9 | [11:10] 00 | [9:5] Rn | [4:0] Rd
B-Type
Used for unconditional branches
[31:26] 000101 | [25:0] BR_Address26
0x05: B
B -3 // PC = PC + 4*-3
000101 11111111111111111111111101
(opcode 05, BR_Address26 = -3)
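The B-Type packing can be sketched in C as a 6-bit opcode followed by the offset truncated to 26 bits of two's complement. The function name `encode_b` is hypothetical.

```c
#include <stdint.h>

/* Packs an unconditional branch: opcode 000101 in bits 31..26 and the
   signed instruction offset in bits 25..0. */
uint32_t encode_b(int32_t offset) {
    uint32_t opcode = 0x05;                            /* 000101 */
    uint32_t addr26 = (uint32_t)offset & 0x03FFFFFFu;  /* keep the low 26 bits */
    return (opcode << 26) | addr26;
}
```

For `B -3` this yields 0x17FFFFFD, which is the bit pattern 000101 followed by the 26-bit encoding of -3.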
CB-Type
Used for conditional branches
[31:24] Opcode | [23:5] Cond_Br_Addr19 | [4:0] Rt (Reg or Cond. Code)
0x54: B.cond
0xB4: CBZ
0xB5: CBNZ
Condition Codes
0x00: EQ (==)
0x01: NE (!=)
0x0A: GE (>=)
0x0B: LT (<)
0x0C: GT (>)
0x0D: LE (<=)
CBZ X12, -3 // if(X12==0) PC = PC + 4*-3
10110100 1111111111111111101 01100
(B4, -3, X12)
B.LT -5 // if (lessThan) PC = PC + 4*-5
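A small C sketch of CB-Type packing follows, covering both the register form (CBZ/CBNZ) and the condition-code form (B.cond). The function name `encode_cb` is an assumption. The opcode and condition values are the ones listed on the slide.

```c
#include <stdint.h>

/* Packs a conditional branch: 8-bit opcode in bits 31..24, signed 19-bit
   instruction offset in bits 23..5, and a register number or condition
   code in bits 4..0. */
uint32_t encode_cb(uint32_t opcode8, int32_t offset, uint32_t rt) {
    uint32_t addr19 = (uint32_t)offset & 0x7FFFFu;  /* keep the low 19 bits */
    return (opcode8 << 24) | (addr19 << 5) | (rt & 0x1Fu);
}
```

`encode_cb(0xB4, -3, 12)` gives 0xB4FFFFAC for `CBZ X12, -3`. For `B.LT -5` the same function is called with opcode 0x54 and the LT code 0x0B in the low five bits.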
R-Type
Used for 3 register ALU operations and shift
[31:21] Opcode | [20:16] Rm | [15:10] SHAMT | [9:5] Rn | [4:0] Rd
Rm: Op2 (0 for shift); SHAMT: Shift amount (0 for non-shift); Rn: Op1; Rd: Dest
0x450: AND
0x458: ADD
0x4D6: SDIV, shamt=02
0x4D8: MUL, shamt=1F
0x550: ORR
0x558: ADDS
0x650: EOR
0x658: SUB
0x69A: LSR
0x69B: LSL
0x6B0: BR, rest all 0’s but Rd
0x750: ANDS
0x758: SUBS
ADD X3, X5, X6 // X3 = X5+X6
10001011000 00110 000000 00101 00011
(458, X6, 0, X5, X3)
LSL X10, X4, #6 // X10 = X4<<6
11010011011 00000 000110 00100 01010
(69B, 0, 6, X4, X10)
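The R-Type fields can be packed with shifts and ORs. The sketch below uses a hypothetical `encode_r` and takes the field values exactly as they appear in the two worked examples.

```c
#include <stdint.h>

/* Packs an R-Type instruction: 11-bit opcode, 5-bit Rm, 6-bit shift amount,
   5-bit Rn, 5-bit Rd, from most to least significant bits. */
uint32_t encode_r(uint32_t opcode11, uint32_t rm, uint32_t shamt,
                  uint32_t rn, uint32_t rd) {
    return ((opcode11 & 0x7FFu) << 21) |
           ((rm & 0x1Fu) << 16) |
           ((shamt & 0x3Fu) << 10) |
           ((rn & 0x1Fu) << 5) |
           (rd & 0x1Fu);
}
```

`encode_r(0x458, 6, 0, 5, 3)` reproduces `ADD X3, X5, X6` as 0x8B0600A3, and `encode_r(0x69B, 0, 6, 4, 10)` reproduces `LSL X10, X4, #6` as 0xD360188A.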
I-Type
Used for 2 register & 1 constant ALU operations
[31:22] Opcode | [21:10] ALU_Imm12 | [9:5] Rn | [4:0] Rd
ALU_Imm12: Constant - Op2; Rn: Op1; Rd: Dest
0x244: ADDI
0x248: ANDI
0x2C4: ADDIS
0x2C8: ORRI
0x344: SUBI
0x348: EORI
0x3C4: SUBIS
0x3C8: ANDIS
ADDI X8, X3, #35 // X8 = X3+35
1001000100 000000100011 00011 01000
(244, 35, X3, X8)
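I-Type packing follows the same pattern with a 10-bit opcode and a 12-bit constant. The sketch below uses the hypothetical name `encode_i`.

```c
#include <stdint.h>

/* Packs an I-Type instruction: 10-bit opcode, 12-bit immediate,
   5-bit Rn, 5-bit Rd, from most to least significant bits. */
uint32_t encode_i(uint32_t opcode10, uint32_t imm12,
                  uint32_t rn, uint32_t rd) {
    return ((opcode10 & 0x3FFu) << 22) |
           ((imm12 & 0xFFFu) << 10) |
           ((rn & 0x1Fu) << 5) |
           (rd & 0x1Fu);
}
```

`encode_i(0x244, 35, 3, 8)` gives 0x91008C68, the hexadecimal form of the bit pattern shown for `ADDI X8, X3, #35`.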
D-Type
Used for memory accesses
[31:21] Opcode | [20:12] DT_Address9 | [11:10] 00 | [9:5] Rn | [4:0] Rt
DT_Address9: Address Constant; Rn: Address Reg; Rt: Target Reg
0x1C0: STURB
0x1C2: LDURB
0x7C0: STUR
0x7C2: LDUR
LDUR X6, [X15, #12] // X6 = Memory[X15+12]
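The D-Type exercise can be checked with a short C sketch. The name `encode_d` is hypothetical, and the signed 9-bit address constant is masked to its low 9 bits.

```c
#include <stdint.h>

/* Packs a D-Type instruction: 11-bit opcode, signed 9-bit address constant,
   two zero bits, 5-bit address register Rn, 5-bit target register Rt. */
uint32_t encode_d(uint32_t opcode11, int32_t address9,
                  uint32_t rn, uint32_t rt) {
    uint32_t addr = (uint32_t)address9 & 0x1FFu;   /* keep the low 9 bits */
    return ((opcode11 & 0x7FFu) << 21) |
           (addr << 12) |                          /* bits 11..10 stay 00 */
           ((rn & 0x1Fu) << 5) |
           (rt & 0x1Fu);
}
```

For `LDUR X6, [X15, #12]` the call is `encode_d(0x7C2, 12, 15, 6)`. As a cross-check against the SWAP listing, `encode_d(0x7C2, 8, 9, 11)` corresponds to `LDUR X11, [X9, #8]`.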
Conversion example
Compute the sum of the values 0…N-1
B TEST
TOP:
ADD X1, X1, X2
B.LT TOP
END:
Assembly & Machine Language
Assembly
Simple instructions
Mnemonics for human developers
(Almost) 1-to-1 relationship to machine language
Machine Language
Numeric representation of instructions
Fixed format(s), simple encode & decode
Directly control microprocessor hardware