
CSCI 232: Introduction To Assembly

Computer Organization

Uploaded by

Arfan Ghani

CSCI 232
Introduction to Assembly

Reading: B&O 3.1-3.4


How does a computer
interpret and execute C
programs?

Learning Assembly

• Moving data around (This Lecture)
• Arithmetic and logical operations (Lecture 8)
• Control flow (Lecture 9)
• Function calls (Lecture 10)

Learning Goals
• Learn what assembly language is and why it is important
• Become familiar with the format of human-readable assembly and x86
• Learn the mov instruction and how data moves around at the assembly level

Lecture Plan
• Overview: GCC and Assembly
• Looking at an executable
• Registers and The Assembly Level of Abstraction
• The mov Instruction
GCC
• GCC is the compiler that converts your human-readable code into
machine-readable instructions.
• C, and other languages, are high-level abstractions we use to write code
efficiently. But computers don’t really understand things like data structures,
variable types, etc. Compilers are the translators!
• Pure machine code is 1s and 0s: everything is bits, even your programs! But
we can read it in a human-readable form called assembly. (Engineers used to
write code in assembly before C.)
• A single C statement may need multiple assembly instructions to encode it.
• We’re going to go behind the curtain to see what the assembly code for our
programs looks like.
AURAK

Intel x86 Processors


• Dominate laptop/desktop/server market

• Evolutionary design
– Backwards compatible all the way back to the 8086, introduced in 1978
– Added more features as time goes on

• Complex instruction set computer (CISC)


– Many different instructions with many different formats
• But, only small subset encountered with Linux programs
– Hard to match performance of Reduced Instruction Set
Computers (RISC)
– But, Intel has done just that!
• In terms of speed. Less so for low power.


Intel x86 Evolution: Milestones


Name Date Transistors MHz
• 8086 1978 29K 5-10
– First 16-bit Intel processor. Basis for IBM PC & DOS
– 1MB address space
• 386 1985 275K 16-33
– First 32-bit Intel processor, referred to as IA32
– Added “flat addressing”, capable of running Unix
• Pentium 4E 2004 125M 2800-3800
– First 64-bit Intel x86 processor, referred to as x86-64
• Core 2 2006 291M 1060-3500
– First multi-core Intel processor
• Core i7 2008 731M 1700-3900
– Four cores

Intel x86 Processors, cont.


• Machine Evolution
– 386 1985 0.3M
– Pentium 1993 3.1M
– Pentium/MMX 1997 4.5M
– PentiumPro 1995 6.5M
– Pentium III 1999 8.2M
– Pentium 4 2001 42M
– Core 2 Duo 2006 291M
– Core i7 2008 731M
• Added Features
– Instructions to support multimedia operations
– Instructions to enable more efficient conditional operations
– Transition from 32 bits to 64 bits
– More cores


2015 State of the Art


– Core i7 Broadwell 2015

• Desktop Model
– 4 cores
– Integrated graphics
– 3.3-3.8 GHz
– 65W

• Server Model
– 8 cores
– Integrated I/O
– 2-2.6 GHz
– 45W


x86 Clones: Advanced Micro Devices (AMD)
• Historically
–AMD has followed just behind Intel
–A little bit slower, a lot cheaper
• Then
–Recruited top circuit designers from Digital Equipment Corp. and
other downward trending companies
–Built Opteron: tough competitor to Pentium 4
–Developed x86-64, their own extension to 64 bits
• Recent Years
–Intel got its act together
• Leads the world in semiconductor technology
–AMD has fallen behind
• Relies on external semiconductor manufacturer


Intel’s 64-Bit History


• 2001: Intel Attempts Radical Shift from IA32 to IA64
– Totally different architecture (Itanium)
– Executes IA32 code only as legacy
– Performance disappointing
• 2003: AMD Steps in with Evolutionary Solution
– x86-64 (now called “AMD64”)
• Intel Felt Obligated to Focus on IA64
– Hard to admit mistake or that AMD is better
• 2004: Intel Announces EM64T extension to IA32
– Extended Memory 64-bit Technology
– Almost identical to x86-64!
• All but low-end x86 processors support x86-64
– But, lots of code still runs in 32-bit mode


Our Coverage
• IA32
– The traditional x86

• x86-64
– The standard
– shark> gcc hello.c

• Presentation
– Book covers x86-64
– We will only cover x86-64

Our First Assembly
// function to return sum of elements in an array of size nelems
int sum_array(int arr[], int nelems) {
    int sum = 0;                          // initialize sum
    for (int i = 0; i < nelems; i++) {    // iterate through all elements
        sum += arr[i];                    // and add them to sum
    }
    return sum;
}

What does this look like in assembly?

Our First Assembly
Build with make, then disassemble with objdump -d sum:

00000000004005b6 <sum_array>:
  4005b6: ba 00 00 00 00    mov    $0x0,%edx
  4005bb: b8 00 00 00 00    mov    $0x0,%eax
  4005c0: eb 09             jmp    4005cb <sum_array+0x15>
  4005c2: 48 63 ca          movslq %edx,%rcx
  4005c5: 03 04 8f          add    (%rdi,%rcx,4),%eax
  4005c8: 83 c2 01          add    $0x1,%edx
  4005cb: 39 f2             cmp    %esi,%edx
  4005cd: 7c f3             jl     4005c2 <sum_array+0xc>
  4005cf: f3 c3             repz retq
• movslq moves a value from a 32-bit source to a 64-bit destination, sign-extending it.
• The first line, 00000000004005b6 <sum_array>:, gives the name of the function (same as in C) and the memory address where the code for this function starts.
• The left column (4005b6:, 4005bb:, ...) holds the memory addresses where each of the instructions lives. Sequential instructions are sequential in memory.
• The right column (e.g. mov $0x0,%edx) is the assembly code: the “human-readable” version of each machine code instruction.
• The middle column (e.g. ba 00 00 00 00) is the machine code: raw hexadecimal instructions, representing binary as read by the computer. Different instructions may have different byte lengths.
• Each instruction has an operation name (“opcode”), e.g. mov, jmp, add, cmp.
• Each instruction can also have arguments (“operands”), e.g. $0x0 and %edx in mov $0x0,%edx.
• $[number] means a constant value, or “immediate” (e.g. the 1 in add $0x1,%edx).
• %[name] means a register, a storage location on the CPU (e.g. %edx).
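One way to read the sum_array disassembly is to transcribe it instruction by instruction into C with goto statements. The sketch below is a hand-written illustration, not compiler output; the function name sum_array_goto and the variables edx, eax and rcx are names chosen here to mirror the registers in the listing.

```c
/* Hand transcription of the sum_array disassembly into goto-style C.
 * Variable names mirror the registers; this is an illustration only. */
int sum_array_goto(int arr[], int nelems) {
    int edx = 0;            /* mov $0x0,%edx    : i = 0                  */
    int eax = 0;            /* mov $0x0,%eax    : sum = 0                */
    long rcx;
    goto test;              /* jmp 4005cb       : jump to the loop test  */
body:
    rcx = (long)edx;        /* movslq %edx,%rcx : sign-extend i          */
    eax += arr[rcx];        /* add (%rdi,%rcx,4),%eax : sum += arr[i]    */
    edx += 1;               /* add $0x1,%edx    : i++                    */
test:
    if (edx < nelems)       /* cmp %esi,%edx    : compare i with nelems  */
        goto body;          /* jl 4005c2        : loop while i < nelems  */
    return eax;             /* retq             : result is in %eax      */
}
```

Reading it this way suggests that arr arrives in %rdi and nelems in %esi, and that the return value leaves in %eax.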
Assembly Abstraction
• C abstracts away the low-level details of machine code. It lets us work using
variables, variable types, and other higher-level abstractions.
• C and other languages let us write code that works on most machines.
• Assembly code is just bytes! No variable types, no type checking, etc.
• Assembly/machine code is processor-specific.
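To make "just bytes, no variable types" concrete, here is a minimal sketch in which the same four bytes are read once as a float and once as an integer. The helper names bits_as_float and bits_as_int are made up for this example, and it assumes a 4-byte IEEE 754 float, which holds on x86-64.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret four bytes as a float (assumes a 4-byte IEEE 754 float). */
float bits_as_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);   /* copy the raw bytes, no conversion */
    return f;
}

/* Reinterpret the same four bytes as a signed 32-bit integer. */
int32_t bits_as_int(uint32_t bits) {
    int32_t i;
    memcpy(&i, &bits, sizeof i);   /* copy the raw bytes, no conversion */
    return i;
}
```

The bytes never change; only the type that C attaches to them decides how they are interpreted.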

Registers

%rax %rsi %r8 %r12

%rbx %rdi %r9 %r13

%rcx %rbp %r10 %r14

%rdx %rsp %r11 %r15

Registers

What is a register?
A register is a fast read/write memory
slot right on the CPU that can hold
variable values.
Registers are not located in memory.
Registers
• A register is a 64-bit space inside the processor.
• There are 16 general-purpose registers available, each with a unique name.
• Registers are like “scratch paper” for the processor. Data being calculated or
manipulated is moved to registers first. Operations are performed on
registers.
• Registers also hold parameters and return values for functions.
• Registers are extremely fast storage!
• Processor instructions consist mostly of moving data into/out of registers and
performing arithmetic on them. Your program must be expressed at this level of
logic in order to execute!

Machine-Level Code
Assembly instructions manipulate these registers. For example:
• One instruction adds two numbers in registers
• One instruction transfers data from a register to memory
• One instruction transfers data from memory to a register

Computer architecture

[Figure: computer architecture. Registers are accessed by name; the ALU is the main workhorse of the CPU; memory needed for program execution (stack, heap, etc.) is accessed by address; the disk/server stores the program when it is not executing.]
GCC And Assembly
• GCC compiles your program – it lays out memory on the stack and heap and
generates assembly instructions to access and do calculations on those
memory locations.
• Here’s what the “assembly-level abstraction” of C code might look like:

C:
int sum = x + y;

Assembly Abstraction:
1) Copy x into register 1
2) Copy y into register 2
3) Add register 2 to register 1
4) Write register 1 to memory for sum
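The four steps can be sketched in C itself by using local variables as stand-ins for registers. This is only an illustration of the abstraction: reg1, reg2 and the function name add_via_registers are hypothetical, and a real compiler is free to choose different registers or skip steps.

```c
/* Mimics the four assembly-level steps for: int sum = x + y;
 * reg1 and reg2 are hypothetical stand-ins for two CPU registers. */
int add_via_registers(int x, int y) {
    int reg1, reg2, sum;
    reg1 = x;        /* 1) copy x into register 1              */
    reg2 = y;        /* 2) copy y into register 2              */
    reg1 += reg2;    /* 3) add register 2 to register 1        */
    sum = reg1;      /* 4) write register 1 to memory for sum  */
    return sum;
}
```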

Assembly
• We are going to learn the x86-64 instruction set architecture. This instruction
set is used by Intel and AMD processors.
• There are many other instruction sets: ARM, MIPS, etc.

Instruction set architecture (ISA)
A contract between program/compiler and hardware:
• Defines operations that the processor (CPU) can execute
• Data read/write/transfer operations
• Control mechanisms

[Figure: abstraction layers, top to bottom: Application program; Compiler, OS; ISA; CPU design; Circuit design; Chip layout]

Intel originally designed their instruction set back in 1978.
• Legacy support is a huge issue for x86-64
• Originally 16-bit processor, then 32 bit, now 64 bit.
These design choices dictated the register sizes
(and even register/instruction names).
mov
The mov instruction copies bytes from one place to another;
it is similar to the assignment operator (=) in C.
mov src,dst

The src and dst can each be one of:
• Immediate (constant value, like a number; only src), e.g. $0x104
• Register, e.g. %rbx
• Memory location (at most one of src, dst), e.g. the direct address 0x6005c0
Operand Forms: Immediate

mov $0x104,_____
Copy the value 0x104 into some destination.
Operand Forms: Registers

mov %rbx,____
Copy the value in register %rbx into some destination.

mov ____,%rbx
Copy the value from some source into register %rbx.
Operand Forms: Absolute Addresses

mov 0x104,_____
Copy the value at address 0x104 into some destination.

mov _____,0x104
Copy the value from some source into the memory at address 0x104.
Practice #1: Operand Forms
What are the results of the following move instructions (executed separately)?
For this problem, assume the value 5 is stored at address 0x42, and the value 8
is stored in %rbx.

1. mov $0x42,%rax

2. mov 0x42,%rax

3. mov %rbx,0x55
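One way to check answers to these three moves is to model the machine in C. The sketch below is a toy: the toy_machine type and the practice1_q* function names are invented here, and mem[a] holds the whole value stored "at address a" (real memory is byte-addressed, which this ignores).

```c
#include <stdint.h>

/* Toy machine: two registers and a small memory.
 * mem[a] holds the whole value stored "at address a"; this ignores
 * byte-level layout and is a simplification for the exercise. */
typedef struct {
    uint64_t rax, rbx;
    uint64_t mem[0x100];
} toy_machine;

/* Initial state from the problem: 5 at address 0x42, 8 in %rbx. */
static toy_machine toy_init(void) {
    toy_machine m = {0};
    m.mem[0x42] = 5;
    m.rbx = 8;
    return m;
}

/* 1. mov $0x42,%rax : immediate -> register; returns the new %rax */
uint64_t practice1_q1(void) {
    toy_machine m = toy_init();
    m.rax = 0x42;
    return m.rax;
}

/* 2. mov 0x42,%rax : memory -> register; returns the new %rax */
uint64_t practice1_q2(void) {
    toy_machine m = toy_init();
    m.rax = m.mem[0x42];
    return m.rax;
}

/* 3. mov %rbx,0x55 : register -> memory; returns the value now at 0x55 */
uint64_t practice1_q3(void) {
    toy_machine m = toy_init();
    m.mem[0x55] = m.rbx;
    return m.mem[0x55];
}
```

The only difference between the first two moves is the $ sign: with it, 0x42 is the value itself; without it, 0x42 is an address to read from.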

Operand Forms: Indirect

mov (%rbx),_____
Copy the value at the address stored in register %rbx into some destination.

mov _____,(%rbx)
Copy the value from some source into the memory at the address stored in register %rbx.
Operand Forms: Base + Displacement

mov 0x10(%rax),_________
Copy the value at the address (0x10 plus what is stored in register %rax) into some destination.

mov __________,0x10(%rax)
Copy the value from some source into the memory at the address (0x10 plus what is stored in register %rax).
Operand Forms: Indexed

mov (%rax,%rdx),__________
Copy the value at the address which is (the sum of the values in registers %rax and %rdx) into some destination.

mov ___________,(%rax,%rdx)
Copy the value from some source into the memory at the address which is (the sum of the values in registers %rax and %rdx).
Operand Forms: Indexed

mov 0x10(%rax,%rdx),______
Copy the value at the address which is (the sum of 0x10 plus the values in registers %rax and %rdx) into some destination.

mov _______,0x10(%rax,%rdx)
Copy the value from some source into the memory at the address which is (the sum of 0x10 plus the values in registers %rax and %rdx).
Practice #2: Operand Forms
What are the results of the following move instructions (executed separately)?
For this problem, assume the value 0x11 is stored at address 0x10C, 0xAB is
stored at address 0x104, 0x100 is stored in register %rax and 0x3 is stored in
%rdx.

1. mov $0x42,(%rax)
2. mov 4(%rax),%rcx
3. mov 9(%rax,%rdx),%rcx

Imm(rb, ri) is equivalent to address Imm + R[rb] + R[ri]
• Displacement (Imm): positive or negative constant (if missing, = 0)
• Base (rb): register (if missing, = 0)
• Index (ri): register (if missing, = 0)
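The Practice #2 moves can be checked with a small C model that applies the Imm + R[rb] + R[ri] rule. This is a sketch under simplifying assumptions: addr_of, practice2 and TOY_MEM_SIZE are names invented here, and mem[a] holds the whole value stored "at address a" rather than modelling individual bytes.

```c
#include <stdint.h>

enum { TOY_MEM_SIZE = 0x200 };   /* hypothetical size, large enough for the exercise */

/* Address named by Imm(rb, ri): Imm + R[rb] + R[ri]. */
static uint64_t addr_of(int64_t imm, uint64_t base, uint64_t index) {
    return (uint64_t)imm + base + index;
}

/* Runs one of the three moves on a fresh copy of the initial state and
 * returns the value that was written (to memory for 1, to %rcx for 2 and 3). */
uint64_t practice2(int question) {
    uint64_t mem[TOY_MEM_SIZE] = {0};
    uint64_t rax = 0x100, rdx = 0x3, rcx = 0;
    mem[0x10C] = 0x11;
    mem[0x104] = 0xAB;

    switch (question) {
    case 1:                                   /* mov $0x42,(%rax)      */
        mem[addr_of(0, rax, 0)] = 0x42;       /* address 0x100         */
        return mem[0x100];
    case 2:                                   /* mov 4(%rax),%rcx      */
        rcx = mem[addr_of(4, rax, 0)];        /* address 0x104         */
        return rcx;
    case 3:                                   /* mov 9(%rax,%rdx),%rcx */
        rcx = mem[addr_of(9, rax, rdx)];      /* address 0x10C         */
        return rcx;
    default:
        return 0;
    }
}
```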
Operand Forms: Scaled Indexed

mov (,%rdx,4),______
Copy the value at the address which is (4 times the value in register %rdx) into some destination.

mov _______,(,%rdx,4)
Copy the value from some source into the memory at the address which is (4 times the value in register %rdx).

The scaling factor (e.g. 4 here) must be hardcoded to be either 1, 2, 4 or 8.
Operand Forms: Scaled Indexed

mov 0x4(,%rdx,4),______
Copy the value at the address which is (4 times the value in register %rdx, plus 0x4), into some destination.

mov _______,0x4(,%rdx,4)
Copy the value from some source into the memory at the address which is (4 times the value in register %rdx, plus 0x4).
Operand Forms: Scaled Indexed

mov (%rax,%rdx,2),________
Copy the value at the address which is (the value in register %rax plus 2 times the value in register %rdx) into some destination.

mov _________,(%rax,%rdx,2)
Copy the value from some source into the memory at the address which is (the value in register %rax plus 2 times the value in register %rdx).
Operand Forms: Scaled Indexed

mov 0x4(%rax,%rdx,2),_____
Copy the value at the address which is (0x4 plus the value in register %rax plus 2 times the value in register %rdx) into some destination.

mov ______,0x4(%rax,%rdx,2)
Copy the value from some source into the memory at the address which is (0x4 plus the value in register %rax plus 2 times the value in register %rdx).
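Scaled indexing is what array indexing in C looks like at the machine level: the base register holds the array's address, the index register holds i, and the scale is the element size. The sketch below reproduces that arithmetic by hand; scaled_index_address and load_scaled are illustrative names, and sizeof(int) stands in for the hardcoded scale (4 for int on x86-64 Linux).

```c
#include <stddef.h>
#include <stdint.h>

/* Address of arr[i] computed the way (base,index,scale) does:
 * base + index * scale, with scale = sizeof(int). */
uintptr_t scaled_index_address(const int *arr, size_t i) {
    return (uintptr_t)arr + i * sizeof(int);
}

/* Load the int at that address, like mov (base,index,scale),dst. */
int load_scaled(const int *arr, size_t i) {
    return *(const int *)scaled_index_address(arr, i);
}
```

This is why the permitted scales are 1, 2, 4 and 8: they match the sizes of the common C data types.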
Most General Operand Form

Imm(rb, ri, s) is equivalent to address Imm + R[rb] + R[ri]*s
• Displacement (Imm): positive or negative constant (if missing, = 0)
• Base (rb): register (if missing, = 0)
• Index (ri): register (if missing, = 0)
• Scale (s): must be 1, 2, 4, or 8 (if missing, = 1)
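The general form is just arithmetic, so it can be written as a short C function. This is a sketch: the names effective_address and scale_is_valid are invented here, and the register contents R[rb] and R[ri] are passed in as plain integers.

```c
#include <stdbool.h>
#include <stdint.h>

/* The scale in Imm(rb, ri, s) must be 1, 2, 4, or 8. */
bool scale_is_valid(unsigned s) {
    return s == 1 || s == 2 || s == 4 || s == 8;
}

/* Computes Imm + R[rb] + R[ri]*s.
 * rb and ri are the register *contents*; pass 0 for a missing base or
 * index, 0 for a missing displacement, and 1 for a missing scale. */
uint64_t effective_address(int64_t imm, uint64_t rb, uint64_t ri, unsigned s) {
    return (uint64_t)imm + rb + ri * s;
}
```

For example, with 0x100 in %rax and 0x3 in %rdx, the operand 0x4(%rax,%rdx,2) names address 0x4 + 0x100 + 0x3*2 = 0x10A.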
Recap
• Overview: GCC and Assembly
• Looking at an executable
• Registers and The Assembly Level of Abstraction
• The mov instruction

Next time: diving deeper into assembly

Lecture takeaway: Assembly is the human-readable
version of the form in which our programs are ultimately
executed by the processor. The compiler translates
source code to machine code. The most common
assembly instruction is mov, which moves data around.
Central Processing Units (CPUs)

• Intel 8086, 16-bit microprocessor ($86.65, 1978)
• Raspberry Pi BCM2836, 32-bit ARM microprocessor ($35 for everything, 2015)
• Intel Core i9-9900K, 64-bit 8-core multi-core processor ($449, 2018)
Why are we reading assembly?

[Figure: idea → C code → assembly code → machine code. The C code is programmer-generated; the assembly code and machine code are generated by gcc (compiler + assembler).]
Main goal: Information retrieval
• We will not be writing assembly! (that’s the compiler’s job)
• Rather, we want to translate the assembly back into our C code.
• Knowing how our C code is converted into machine instructions gives us
insight into how to write more efficient, cleaner code.
Extended warmup: Information Synthesis
Spend a few minutes thinking about the main paradigms of the mov instruction.
• What might be the equivalent C-like operation?
• Examples (note %r__ registers are 64-bit):
1. mov $0x0,%rdx
2. mov %rdx,%rcx
3. mov $0x42,(%rdi)
4. mov (%rax,%rcx,8),%rax
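One possible set of C-like readings of the four examples is sketched below. The function names warmup1 to warmup4 are invented, each variable is named after the register it stands for, and the sketch assumes 8-byte values throughout (the plain mov in example 3 does not state an operand size by itself).

```c
#include <stdint.h>

/* 1. mov $0x0,%rdx : assign a constant, like  rdx = 0; */
int64_t warmup1(void) {
    int64_t rdx = 0x0;
    return rdx;
}

/* 2. mov %rdx,%rcx : copy one variable to another, like  rcx = rdx; */
int64_t warmup2(int64_t rdx) {
    int64_t rcx = rdx;
    return rcx;
}

/* 3. mov $0x42,(%rdi) : store through a pointer, like  *rdi = 0x42; */
void warmup3(int64_t *rdi) {
    *rdi = 0x42;
}

/* 4. mov (%rax,%rcx,8),%rax : index an array of 8-byte elements,
 *    like  rax = rax[rcx];  (the pointer in %rax is overwritten by the value) */
int64_t warmup4(const int64_t *rax, int64_t rcx) {
    return rax[rcx];
}
```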
