13 Assembly1
Assembly Language: Part 1
Context of this Lecture
First half of the semester: “Programming in the large”
Second half: “Under the hood”
[Figure: two tours. Language levels tour: Assembly Language, Machine Language. Service levels tour: Operating System, Hardware.]
Lectures vs. Precepts
Lectures Precepts
Agenda
Language Levels
Architecture
Assembly Language: Performing Arithmetic
Assembly Language: Load/Store and Defining Global Data
High-Level Languages
Characteristics
• Portable
  • To varying degrees
• Complex
  • One statement can do much work – good ratio of functionality to code size
• Human readable
  • Structured – if(), for(), while(), etc.
Example:
   count = 0;
   while (n>1)
   {  count++;
      if (n&1)
         n = n*3+1;
      else
         n = n/2;
   }
Machine Languages
0000 0000 0000 0000 0000 0000 0000 0000
Assembly Languages
Characteristics
• Not portable
  • Each assembly language instruction maps to one machine language instruction
• Simple
  • Each instruction does a simple task
• Human readable
  (In the same sense that Polish is human readable, if you know Polish.)
Example (n in w0, count in w1):
            mov   w1, 0          // count = 0
   loop:
            cmp   w0, 1          // while (n > 1)
            ble   endloop
            add   w1, w1, #1     // count++
            ands  wzr, w0, #1    // if (n & 1)
            beq   else
            add   w2, w0, w0     // n = n*3 + 1
            add   w0, w0, w2
            add   w0, w0, 1
            b     endif
   else:
            asr   w0, w0, 1      // n = n/2
   endif:
            b     loop
   endloop:
Why Learn Assembly Language?
Cons
• x86-64 dominates the desktop/laptop, for now
(but there are rumors that Apple is going to shift Macs to ARM…)
Agenda
Language Levels
Architecture
Assembly Language: Performing Arithmetic
Assembly Language: Load/Store and Defining Global Data
John Von Neumann (1903-1957)
In computing
• Stored program computers
• Cellular automata
• Self-replication
Other interests
• Mathematics
• Inventor of game theory
• Nuclear physics (hydrogen bomb)
Princeton connection
• Princeton Univ & IAS, 1930-1957
Von Neumann Architecture
[Figure: Von Neumann architecture diagram; recoverable labels: Data bus, RAM]
Von Neumann Architecture
Registers
• Small amount of storage on the CPU (tens of words in modern machines)
• Much faster than RAM
• Top of the “storage hierarchy”: above RAM, disk, etc.
[Figure: CPU containing Control Unit, ALU, and Registers; RAM outside the CPU]
Registers and RAM
Typical pattern:
• Load data from RAM to registers
• Manipulate data in registers
• Store data from registers to RAM
Registers (ARM-64 architecture)
63 31 0
x0 w0
x1 w1
…
x29 (FP) w29
sp (stack pointer)
pc (program counter)
nzcv pstate
General-Purpose Registers
X0 .. X30
• 64-bit registers
• Scratch space for instructions, parameter passing to/from functions,
return address for function calls, etc.
• Some have special purposes defined in hardware (e.g. X30)
or defined by software convention (e.g. X29)
• Also available as 32-bit versions: W0 .. W30
XZR
• On read: all zeros
• On write: data thrown away
SP Register
Special-purpose register…
• Contains SP (Stack Pointer): address of top (low address) of current function’s stack frame
[Figure: memory drawn from low memory to high memory; SP points at the current STACK frame]
PC Register
Special-purpose register…
• Contains PC (Program Counter)
• Stores the location of the next instruction
• Address (in TEXT section) of machine-language
instructions to be executed next
• Value changed:
• Automatically to implement sequential control flow
• By branch instructions to implement selection, repetition
[Figure: PC pointing into the TEXT section]
PSTATE Register
nzcv pstate
Special-purpose register…
• Contains condition flags:
n (Negative), z (Zero), c (Carry), v (oVerflow)
• Affected by compare (cmp) instruction
• And many others, if requested
• Used by conditional branch instructions
• beq, bne, blo, bhi, ble, bge, …
• (See Assembly Language: Part 2 lecture)
Agenda
Language Levels
Architecture
Assembly Language: Performing Arithmetic
Assembly Language: Load/Store and Defining Global Data
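ALU
[Figure: CPU containing the Control Unit, ALU, and Registers, connected to RAM by the data bus. The ALU takes src1, src2, and an operation as inputs and produces dest and the PSTATE flags.]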
Instruction Format
Many instructions have this format:
   op dest, src1, src2
For example, add x3, x1, x2 computes x3 = x1 + x2.
64-bit Arithmetic
C code:
   static long length;
   static long width;
   static long perim;
   ...
   perim =
      (length + width) * 2;
Assume that…
• length stored in x1
• width stored in x2
• perim stored in x3
We’ll see later how to make this happen
Assembly code:
   add x3, x1, x2
   lsl x3, x3, 1
Recall use of left shift by 1 bit to multiply by 2
More Arithmetic
C code:
   static long x;
   static long y;
   static long z;
   ...
   z = x - y;
   z = x * y;
   z = x / y;
   z = x & y;
   z = x | y;
   z = x ^ y;
   z = x >> y;
Assume that…
• x stored in x1
• y stored in x2
• z stored in x3
We’ll see later how to make this happen
Assembly code:
   sub  x3, x1, x2
   mul  x3, x1, x2
   sdiv x3, x1, x2
   and  x3, x1, x2
   orr  x3, x1, x2
   eor  x3, x1, x2
   asr  x3, x1, x2
Note arithmetic shift! Logical right shift with lsr instruction
More Arithmetic: Shortcuts
C code:
   static long x;
   static long y;
   static long z;
   ...
   z = x;
   z = -x;
Assume that…
• x stored in x1
• y stored in x2
• z stored in x3
We’ll see later how to make this happen
Assembly code:
   mov x3, x1
   neg x3, x1
These are actually assembler shortcuts for instructions with XZR!
   orr x3, xzr, x1
   sub x3, xzr, x1
Signed vs Unsigned?
C code:
   static long x;
   static unsigned long y;
   ...
   x++;
   y--;
Assume that…
• x stored in x1
• y stored in x2
32-bit Arithmetic
C code:
   static int length;
   static int width;
   static int perim;
   ...
   perim =
      (length + width) * 2;
Assume that…
• length stored in w1
• width stored in w2
• perim stored in w3
We’ll see later how to make this happen
8- and 16-bit Arithmetic?
static char x;
static short y;
...
x++;
y--;
No specialized instructions
• Use “w” registers
• Specialized “load” and “store” instructions for transfer of
shorter data types from / to memory – we’ll see these later
• Corresponds to C language semantics: all arithmetic is
implicitly done on (at least) ints
Agenda
Language Levels
Architecture
Assembly Language: Performing Arithmetic
Assembly Language: Load/Store and Defining Global Data
Loads and Stores
Most basic way to load (from RAM) and store (to RAM):
   ldr dest, [src]
   str src, [dest]
Loads and Stores
C code:
   static int length = 1;
   static int width = 2;
   static int perim = 0;

   int main()
   {
      perim =
         (length + width) * 2;
      return 0;
   }
Assembly code:
           .section .data
   length: .word 1
   width:  .word 2
   perim:  .word 0
           .section .text
           .global main
   main:
           adr x0, length
           ldr w1, [x0]
           adr x0, width
           ldr w2, [x0]
           add w1, w1, w2
           lsl w1, w1, 1
           adr x0, perim
           str w1, [x0]
           mov w0, 0
           ret
Loads and Stores
Sections (the .section lines in the program above)
• .data: read-write
• .rodata: read-only
• .bss: read-write, initialized to zero
• .text: read-only, program code
Stack and heap work differently!
Loads and Stores
Declaring data (the length:, width:, and perim: lines in the program above)
• “Labels” for locations in memory
• .word: 32-bit integer
Loads and Stores
Global symbol (the .global main line in the program above)
• Declare “main” to be a globally-visible label
Loads and Stores
Generating addresses (the adr lines in the program above)
• adr instruction stores address of a label in a register
Loads and Stores
Load and store (the ldr and str lines in the program above)
• Use “pointer” in x0 to load from and store to memory
Loads and Stores
Registers and memory as the program above executes:
   After executing                  x0                 w1   w2   length  width  perim
   (start)                          -                  -    -    1       2      0
   adr x0, length; ldr w1, [x0]     address of length  1    -    1       2      0
   adr x0, width;  ldr w2, [x0]     address of width   1    2    1       2      0
   add w1, w1, w2; lsl w1, w1, 1    address of width   6    2    1       2      0
   adr x0, perim;  str w1, [x0]     address of perim   6    2    1       2      6
Loads and Stores
Return value (the mov w0, 0 line in the program above)
• Passed in register w0
Loads and Stores
Return to caller (the last line of the program above)
• ret instruction
Defining Data: DATA Section 1
C code:
   static char c = 'a';
   static short s = 12;
   static int i = 345;
   static long l = 6789;
Assembly code:
      .section ".data"
   c:
      .byte 'a'
   s:
      .short 12
   i:
      .word 345
   l:
      .quad 6789
Notes:
• .section directive (to announce DATA section)
• label definition (marks a spot in RAM)
• .byte directive (1 byte)
• .short directive (2 bytes)
• .word directive (4 bytes)
• .quad directive (8 bytes)
Defining Data: DATA Section 2
C code:
   char c = 'a';
   short s = 12;
   int i = 345;
   long l = 6789;
Assembly code:
      .section ".data"
      .global c
   c: .byte 'a'
      .global s
   s: .short 12
      .global i
   i: .word 345
      .global l
   l: .quad 6789
Notes:
• Can place label on same line as next instruction
• .global directive
Defining Data: BSS Section
C code:
   static char c;
   static short s;
   static int i;
   static long l;
Assembly code:
      .section ".bss"
   c:
      .skip 1
   s:
      .skip 2
   i:
      .skip 4
   l:
      .skip 8
Notes:
• .section directive (to announce BSS section)
• .skip directive
Defining Data: RODATA Section
C code:
   …
   …"hello\n"…;
   …
Assembly code:
      .section ".rodata"
   helloLabel:
      .string "hello\n"
Notes:
• .section directive (to announce RODATA section)
• .string directive
Signed vs Unsigned, 8- and 16-bit
ldrb dest, [src]
ldrh dest, [src]
strb src, [dest]
strh src, [dest]
To learn more
• Study more assembly language examples
• Chapters 2-5 of Pyeatt and Ughetta book
• Study compiler-generated assembly language code
• gcc217 -S somefile.c
Appendix
Byte Order
AARCH64 is a little endian architecture
• Least significant byte of multi-byte entity is stored at lowest memory address
• “Little end goes first”
The int 5 at address 1000:
   1000 00000101
   1001 00000000
   1002 00000000
   1003 00000000
Some other systems use big endian
• Most significant byte of multi-byte entity is stored at lowest memory address
• “Big end goes first”
The int 5 at address 1000:
   1000 00000000
   1001 00000000
   1002 00000000
   1003 00000101
Byte Order Example 1
   #include <stdio.h>
   int main(void)
   {  unsigned int i = 0x003377ff;
      unsigned char *p;
      int j;
      p = (unsigned char *)&i;
      for (j = 0; j < 4; j++)
         printf("Byte %d: %02x\n", j, p[j]);
      return 0;
   }
Output on a little-endian machine:
   Byte 0: ff
   Byte 1: 77
   Byte 2: 33
   Byte 3: 00
Output on a big-endian machine:
   Byte 0: 00
   Byte 1: 33
   Byte 2: 77
   Byte 3: ff
Byte Order Example 2
           .section ".data"
   foo:    .word 1
           ...
           .section ".text"
           ...
           adr x0, foo
           ldrb w1, [x0]
Note: Flawed code; uses “b” instructions to load from a four-byte memory area
AARCH64 is little endian, so what will be the value in x1?
Byte Order Example 3
           .section ".data"
   foo:    .byte 1
           ...
           .section ".text"
           ...
           adr x0, foo
           ldr w1, [x0]
Note: Flawed code; uses word instructions to manipulate a one-byte memory area