3 Assembly Basics


COMPUTER ORGANIZATION AND ARCHITECTURE

PROGRAM REPRESENTATION
ASSEMBLY: BASICS
JIALIANG TAN SPRING 2025
CSE202

OUTLINE
✦ History of Intel family processors
✦ C, Assembly, and Machine code
✦ Assembly Language Basics
✦ Registers and Operands
✦ Data movement instructions
✦ Arithmetic operation instructions
✦ Logical operation instructions
Lehigh University Spring 2025 2

LEARNING OUTCOMES
✦ Analyze simple assembly programs
✦ Reverse engineer assembly code into its C code
equivalent
✦ Generate assembly code for a given C code
✦ Write simple assembly programs
✦ Identify security issues related to mixing data
and code



CSE202 ASSEMBLY BASICS
Assembly Language?
✦ Human version of the machine code

✦ Used for fine-grained optimization

✦ Used to identify program vulnerabilities

✦ Intel family (x86-64) machine language, found in laptops, desktops,
supercomputers, and large data centers

Intel Processor Line
Year   Processor
1978   8086
1982   80286
1985   i386
1989   i486
1993   Pentium
1995   Pentium Pro
1997   Pentium II, Pentium/MMX
1999   Pentium III
2000   Pentium 4
2004   Pentium 4E
2006   Core 2
2008   Core i7 (Nehalem)
2011   Core i7 (Sandy Bridge)
2013   Core i7 (Haswell)
2017   Core i9
Intel Processor Line
Year   Family        Transistors   Cores
1985   i386          275 K         -
1993   Pentium       3.1 M         -
1995   Pentium Pro   5.5 M         -
1997   Pentium II    7 M           -
1999   Pentium III   8.2 M         -
2000   Pentium 4     42 M          -
2006   Core 2        291 M         2
2008   Core i7       781 M         4
2017   Core i9       ~ 4 - 7 B     16
2023   Core i9                     24

C - Assembly - Machine code
> gcc -oprog prog1.c prog2.c

prog1.c/prog2.c   --[Pre-processor - cpp]-->   prog1.i/prog2.i
prog1.i/prog2.i   --[Compiler - cc1]------->   prog1.s/prog2.s
prog1.s/prog2.s   --[Assembler - as]------->   prog1.o/prog2.o
prog1.o/prog2.o   --[Linker - ld]---------->   prog (a.out)

Disassembler - objdump: turns prog1.o/prog2.o or prog back into assembly

C - Assembly - Machine code
✦ Disassembling object code back to assembly
> objdump -d prog.o
✦ Useful to examine object files
✦ Analyzes the bit patterns of sequences of instructions
✦ Produces approximate assembly code
✦ Can be run on .o or executable files (a.out)

C - Assembly - Machine code
> gcc -S test.c

test.c
    #include <stdio.h>
    int sum(long, long);
    int main(){
        int x = 20;
        int y = 25;
        printf("%d",sum(x,y));
        return 0;
    }

    int sum(long a, long b){
        return a+b;
    }

test.s
        .file   "test.c"
        .section .rodata
    .LC0:
        .string "%d"
        .text
        .globl  main
    main:
    .LFB0:
        pushq   %rbp
        movq    %rsp, %rbp
        subq    $16, %rsp
        movq    $20, -8(%rbp)
        movq    $25, -16(%rbp)
        movq    -16(%rbp), %rdx
        movq    -8(%rbp), %rax
        movq    %rdx, %rsi
        movq    %rax, %rdi
        call    sum
        movq    %rax, %rsi
        movq    $.LC0, %rdi
        movq    $0, %rax
        call    printf
        movq    $0, %rax
        leave
        ret
    sum:
    .LFB1:
        pushq   %rbp
        movq    %rsp, %rbp
        movq    %rdi, -8(%rbp)
        movq    %rsi, -16(%rbp)
        movq    -16(%rbp), %rax
        movq    -8(%rbp), %rdx
        addq    %rdx, %rax
        popq    %rbp
        ret

C - Assembly - Machine code
C language
    int sum(int a, int b){
        return a+b;
    }
Assembly language
    pushq %rbp
    movq %rsp, %rbp
    movl %edi, -4(%rbp)
    movl %esi, -8(%rbp)
    movl -8(%rbp), %eax
    movl -4(%rbp), %edx
    addl %edx, %eax
    popq %rbp
    ret
Object code
    0 : A0 05
    2 : 48 05 04
    5 : . . .
    16 : 60 02 00
    19 : C3

C - Assembly - Machine code
test.c
    #include <stdio.h>
    #include <stdlib.h>

    extern int absdiff(int, int);

    int main(int argc, char **argv) {
        int x = atoi(argv[1]);
        int y = atoi(argv[2]);
        printf("|%d - %d| = %d\n", x, y, absdiff(x, y));
        return 0;
    }

absdiff.S
    .global absdiff

    absdiff:
        movl %edi, %eax     # %eax = x
        subl %esi, %eax     # %eax = x - y
        jge .L1
        negl %eax           # %eax = -%eax
    .L1:
        ret                 # %eax = |x-y|

Building and running test.c with absdiff.S
> gcc -c -otest.o test.c
> as -oabsdiff.o absdiff.S
> gcc -otest test.o absdiff.o
> ./test 10 23
|10 - 23| = 13
> ./test 23 10
|23 - 10| = 13
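
For comparison, a minimal C sketch of what the hand-written absdiff routine computes. The name absdiff_c and the range check are assumptions made for this illustration; they are not part of the slides.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical C counterpart of the assembly absdiff: returns |x - y|.
   Assumption of this sketch: the difference fits in an int, because
   signed overflow is undefined behavior in C. */
int absdiff_c(int x, int y) {
    long long wide = (long long)x - (long long)y;  /* exact difference */
    assert(wide > INT_MIN && wide <= INT_MAX);     /* stay in int range */
    int diff = (int)wide;   /* subl %esi, %eax                  */
    if (diff < 0)           /* jge .L1 skips the negation       */
        diff = -diff;       /* negl %eax                        */
    return diff;
}
```

Calling absdiff_c(10, 23) and absdiff_c(23, 10) should both give 13, matching the two runs shown above.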
Instruction Set Architecture (ISA)
✦ Defines the instructions and registers that assembly code
can use and that the CPU executes
✦ Examples of ISAs
✦ Intel: x86, IA32, x86-64
✦ ARM: Used in mobile phones

Instruction Set
[Figure: the CPU (register file, PC (%rip), IR, arithmetic and logic unit, control unit) is connected to memory by a data bus, which carries data and instructions, and an address bus. The memory frame of a program holds the stack, heap, data, and code.]

Instruction Set
✦ Machine instructions perform one of the following basic operations:

Type                    Description
Register Transfer       Transfer data from one register to another
Memory Transfer         Transfer data from memory to a register or vice-versa
Arithmetic operations   Perform operation using the content of one/two sources
Logical operations      and store the result in one destination (same for both types)
Branch operations       Change the address of the next instruction to be executed
                        if certain conditions are met


✦ Assembly code

✦ Registers/Memory - Data Storage

✦ Instructions - Manipulation of
operands (data)

Registers
✦ Eight (8) 64-bit main registers - %rax to %rsp

✦ Eight (8) 64-bit extra registers - %r8 to %r15

✦ Byte instructions access the least significant byte of the registers
✦ Word instructions access the 2 least significant bytes
✦ Double word instructions access the 4 least significant bytes
✦ Quad word instructions access the entire register
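
The access sizes above can be mimicked in C with narrowing casts. This is only a sketch: the helper names and the sample value are hypothetical, and a cast models which bytes are read, not how a write to a sub-register affects the rest of the register.

```c
#include <assert.h>
#include <stdint.h>

/* Each helper keeps only the low-order bytes that the named
   sub-register exposes. */
uint32_t low_double_word(uint64_t r) { return (uint32_t)r; }  /* like %eax */
uint16_t low_word(uint64_t r)        { return (uint16_t)r; }  /* like %ax  */
uint8_t  low_byte(uint64_t r)        { return (uint8_t)r; }   /* like %al  */

/* Checks the three views against a hypothetical value held in %rax. */
void check_register_views(void) {
    uint64_t rax = 0x1122334455667788ULL;
    assert(low_double_word(rax) == 0x55667788u);
    assert(low_word(rax) == 0x7788u);
    assert(low_byte(rax) == 0x88u);
}
```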

Registers

Sub-registers of %rax (bit 63 is the most significant, bit 0 the least)

Name   Bits     Size
%rax   63 - 0   8 bytes (entire register)
%eax   31 - 0   4 bytes
%ax    15 - 0   2 bytes
%al    7 - 0    1 byte

Registers
%rax %eax %ax %al

%rbx %ebx %bx %bl

%rcx %ecx %cx %cl

%rdx %edx %dx %dl

%rsi %esi %si %sil

%rdi %edi %di %dil


%rbp %ebp %bp %bpl
%rsp %esp %sp %spl

Registers
%rax Accumulator
%rbx Base
%rcx Counter
%rdx Data

%rsi Source Index


%rdi Destination Index
%rbp Base Pointer

%rsp Stack Pointer

Registers
%rax %eax %ax %al Return value
%rbx %ebx %bx %bl Callee saved
%rcx %ecx %cx %cl 4th argument
%rdx %edx %dx %dl 3rd argument
%rsi %esi %si %sil 2nd argument

%rdi %edi %di %dil 1st argument

%rbp %ebp %bp %bpl Callee saved


%rsp %esp %sp %spl Stack pointer

Registers
%r8 %r8d %r8w %r8b 5th argument
%r9 %r9d %r9w %r9b 6th argument
%r10 %r10d %r10w %r10b Caller saved
%r11 %r11d %r11w %r11b Caller saved
%r12 %r12d %r12w %r12b Callee saved
%r13 %r13d %r13w %r13b Callee saved
%r14 %r14d %r14w %r14b Callee saved
%r15 %r15d %r15w %r15b Callee saved

Operands
C type   Intel data type    Assembly code suffix   Size (bytes)
char     Byte               b                      1
short    Word               w                      2
int      Double word        l                      4
long     Quad word          q                      8
char*    Quad word          q                      8
float    Single precision   s                      4
double   Double precision   l                      8
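
The integer rows of the table can be checked against a compiler. The sketch below assumes an LP64 target such as x86-64 Linux, where long and pointers are 8 bytes; suffix_size and check_lp64_sizes are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

/* Operand size in bytes for an integer instruction suffix,
   or 0 when the suffix is not one of b, w, l, q. */
size_t suffix_size(char suffix) {
    switch (suffix) {
    case 'b': return 1;   /* byte        */
    case 'w': return 2;   /* word        */
    case 'l': return 4;   /* double word */
    case 'q': return 8;   /* quad word   */
    default:  return 0;
    }
}

/* Assumption: LP64 data model. On other targets these checks can fail. */
void check_lp64_sizes(void) {
    assert(sizeof(char)   == suffix_size('b'));
    assert(sizeof(short)  == suffix_size('w'));
    assert(sizeof(int)    == suffix_size('l'));
    assert(sizeof(long)   == suffix_size('q'));
    assert(sizeof(char *) == suffix_size('q'));
}
```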
Operands
Instructions may manipulate different types
of operands
✦ Immediate values- constant integer data
($0x40)
✦ Register values - one of the 16 integer
registers (%rax to %r15)
✦ Memory values - 1, 2, 4, or 8 consecutive
bytes of memory at an address computed from
an immediate offset and/or registers
Operands (s: 1, 2, 4, or 8)

Type                Source      Form             Operand value
Immediate           Immediate   $Imm             Imm
Register            Register    ra               R[ra]
Absolute            Memory      Imm              M[Imm]
Indirect            Memory      (ra)             M[R[ra]]
Base+displacement   Memory      Imm(rb)          M[R[rb]+Imm]
Indexed             Memory      (rb, ri)         M[R[rb]+R[ri]]
Indexed             Memory      Imm(rb, ri)      M[R[rb]+R[ri]+Imm]
Scaled Indexed      Memory      (,ri,s)          M[R[ri]*s]
Scaled Indexed      Memory      Imm(,ri,s)       M[R[ri]*s+Imm]
Scaled Indexed      Memory      (rb, ri, s)      M[R[rb]+R[ri]*s]
Scaled Indexed      Memory      Imm(rb, ri, s)   M[R[rb]+R[ri]*s+Imm]

Operands
✦ Given the following values stored in memory
or registers, find the value of the operands

Address   Value      Register   Value
0x100     0xFF       %rax       0x100
0x104     0xAB       %rcx       0x1
0x108     0x13       %rdx       0x3
0x10C     0x11

Operand            Value
%rax               0x100
0x104              0xAB
$0x108             0x108
(%rax)             0xFF
4(%rax)            0xAB
9(%rax,%rdx)       0x11
260(%rcx, %rdx)    0x13
0xFC(,%rcx, 4)     0xFF
(%rax, %rdx, 4)    0x11
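
The memory operands in this exercise can be checked with a small C sketch of the general form Imm(rb, ri, s). The functions effective_address and load are hypothetical helpers, and the toy memory holds only the four values from the table.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of Imm(rb, ri, s): R[rb] + R[ri]*s + Imm.
   Pass 0 for a missing base or index, and s = 1 when no scale is given. */
uint64_t effective_address(int64_t imm, uint64_t rb, uint64_t ri, unsigned s) {
    assert(s == 1 || s == 2 || s == 4 || s == 8);
    return rb + ri * s + (uint64_t)imm;
}

/* Toy memory: only the four addresses used in the exercise are valid. */
uint64_t load(uint64_t addr) {
    switch (addr) {
    case 0x100: return 0xFF;
    case 0x104: return 0xAB;
    case 0x108: return 0x13;
    case 0x10C: return 0x11;
    default:
        assert(0 && "address outside the example");
        return 0;
    }
}
```

With %rax = 0x100, %rcx = 0x1, and %rdx = 0x3, the operand 260(%rcx, %rdx) becomes load(effective_address(260, 0x1, 0x3, 1)), which reads address 0x108.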
Data Movement Operations
mov src, dst -> dst = src

Instruction   Description
movb          Move byte
movw          Move word
movl          Move double word
movq          Move quad word

movl $0x4050,%eax       -- Immediate to register (4 bytes)
movw %bp,%sp            -- Register to register (2 bytes)
movb (%rdi,%rcx),%al    -- Memory to register (1 byte)
movb $-17,(%rsp)        -- Immediate to memory (1 byte)
movq %rax,-12(%rbp)     -- Register to memory (8 bytes)

Data Movement Operations
✦ movz src,dst—> dst=ZeroExtend(src)

Instruction Description

movzbw Move zero-extended byte to word

movzbl Move zero-extended byte to double word

movzwl Move zero-extended word to double word

movzbq Move zero-extended byte to quad word

movzwq Move zero-extended word to quad word

Data Movement Operations
✦ movs src,dst—> dst=SignExtend(src)
Instruction Description
movsbw Move sign-extended byte to word
movsbl Move sign-extended byte to double word
movswl Move sign-extended word to double word
movsbq Move sign-extended byte to quad word
movswq Move sign-extended word to quad word
movslq Move sign-extended double word to quad word
cltq Move sign-extended %eax to %rax
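
Zero extension and sign extension correspond to widening unsigned and signed values in C. A minimal sketch, with hypothetical function names, of three of the instructions above:

```c
#include <assert.h>
#include <stdint.h>

/* Widening an unsigned value fills the upper bits with zeros. */
uint64_t zero_extend_byte(uint8_t src) { return (uint64_t)src; }      /* movzbq */

/* Widening a signed value copies the sign bit into the upper bits. */
int64_t sign_extend_byte(int8_t src) { return (int64_t)src; }         /* movsbq */
int64_t sign_extend_double_word(int32_t src) { return (int64_t)src; } /* movslq */

void check_extensions(void) {
    /* The byte 0xF0 is 240 when unsigned and -16 when signed. */
    assert(zero_extend_byte(0xF0) == 0x00000000000000F0ULL);
    assert((uint64_t)sign_extend_byte(-16) == 0xFFFFFFFFFFFFFFF0ULL);
    assert(sign_extend_double_word(-1) == -1);
}
```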

Data Movement Operations
Practice
Determine the appropriate instruction suffix
mov? %eax,(%rsp)
mov? (%rax),%dx
mov? $0xFF,%bl
mov? (%rsp,%rdx,4),%dl
mov? (%rdx),%rax
mov? %edx,(%rax)

Data Movement Operations
void swap(long *xp, long *yp){
    long t0 = *xp;
    long t1 = *yp;
    *xp = t1;
    *yp = t0;
}
xp in %rdi, yp in %rsi

swap:
    movq (%rdi), %rax    -- %rax = *xp (t0)
    movq (%rsi), %rdx    -- %rdx = *yp (t1)
    movq %rdx, (%rdi)    -- *xp = t1
    movq %rax, (%rsi)    -- *yp = t0
    ret

Data Movement Operations
Tracing swap with %rdi = 0x100 and %rsi = 0x200

Step                                   0x100   0x200   %rax   %rdx
Main memory before the call            25      75
After the two loads (movq from mem)    25      75      25     75
After the two stores (movq to mem)     75      25      25     75

Data Movement Operations
✦ Reverse engineer the assembly code into its
equivalent C code
void decode(long *xp, long *yp, long *zp);
xp in %rdi, yp in %rsi, zp in %rdx
decode:
    movq (%rdi),%r8
    movq (%rsi),%rcx
    movq (%rdx),%rax
    movq %r8,(%rsi)
    movq %rcx,(%rdx)
    movq %rax,(%rdi)
    ret
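
One possible C reading of this routine is sketched below; other equivalent formulations exist. The local names x, y, z and the null-pointer check are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Rotates the three values: *yp gets the old *xp, *zp gets the old *yp,
   and *xp gets the old *zp. All three loads happen before any store. */
void decode(long *xp, long *yp, long *zp) {
    assert(xp != NULL && yp != NULL && zp != NULL);
    long x = *xp;   /* movq (%rdi),%r8  */
    long y = *yp;   /* movq (%rsi),%rcx */
    long z = *zp;   /* movq (%rdx),%rax */
    *yp = x;        /* movq %r8,(%rsi)  */
    *zp = y;        /* movq %rcx,(%rdx) */
    *xp = z;        /* movq %rax,(%rdi) */
}
```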
Data Movement Operations
✦ Pushing and Popping (Stack Data)

push(66): %rsp moves from 0x100 to 0xF8 and 66 is stored at the new top of the stack

Address   Before   After
0xF8               66      <- top of the stack after (%rsp = 0xF8)
0x100     25       25      <- top of the stack before (%rsp = 0x100)
0x108     17       17
0x110     42       42
0x118     31       31
0x120     22       22

pop(): the value at the top of the stack (25) is read and %rsp moves from 0x100 to 0x108

Address   Before   After
0x100     25               <- top of the stack before (%rsp = 0x100)
0x108     17       17      <- top of the stack after (%rsp = 0x108)
0x110     42       42
0x118     31       31
0x120     22       22

Data Movement Operations
✦ Pushing and Popping Stack Data

Instruction   Effect                  Equivalent to
pushq src     R[%rsp] = R[%rsp]-8     subq $8,%rsp
              M[R[%rsp]] = src        movq %rbp,(%rsp)     (for pushq %rbp)
popq dst      dst = M[R[%rsp]]        movq (%rsp),%rax     (for popq %rax)
              R[%rsp] = R[%rsp]+8     addq $8,%rsp
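
The two-step effect of pushq and popq can be sketched in C with a toy stack that grows toward lower offsets. The type toy_stack, its size, and the function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 64   /* hypothetical size of the toy stack */

typedef struct {
    uint64_t slots[STACK_BYTES / 8];
    uint64_t rsp;        /* byte offset of the top of the stack */
} toy_stack;

/* An empty stack has its pointer one past the highest slot. */
void toy_init(toy_stack *s) { s->rsp = STACK_BYTES; }

void toy_pushq(toy_stack *s, uint64_t src) {
    assert(s->rsp >= 8);             /* room for one more quad word */
    s->rsp -= 8;                     /* subq $8,%rsp     */
    s->slots[s->rsp / 8] = src;      /* movq src,(%rsp)  */
}

uint64_t toy_popq(toy_stack *s) {
    assert(s->rsp + 8 <= STACK_BYTES);       /* stack is not empty */
    uint64_t dst = s->slots[s->rsp / 8];     /* movq (%rsp),dst  */
    s->rsp += 8;                             /* addq $8,%rsp     */
    return dst;
}
```

Pushing lowers the stack pointer by 8 before the store, and popping raises it by 8 after the load, so values come back in last-in, first-out order.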

Arithmetic and Logical Operations
Type     Operation                Instruction     Effect
         Load effective address   leaq src,dst    dst = &src (address of src, no memory access)
Unary    Increment                INC dst         dst=dst+1
Unary    Decrement                DEC dst         dst=dst-1
Unary    Negate                   NEG dst         dst=-dst
Unary    Complement               NOT dst         dst=~dst
Binary   Add                      ADD src,dst     dst=dst+src
Binary   Subtract                 SUB src,dst     dst=dst-src
Binary   Multiply                 IMUL src,dst    dst=dst*src
Binary   Exclusive OR             XOR src,dst     dst=dst^src
Binary   OR                       OR src,dst      dst=dst|src
Binary   AND                      AND src,dst     dst=dst&src

Arithmetic and Logical Operations

Type    Operation                Instruction   Effect
Shift   Left shift               SAL k,dst     dst=dst<<k
Shift   Left shift (same)        SHL k,dst     dst=dst<<k
Shift   Arithmetic right shift   SAR k,dst     dst=dst>>k (vacated bits copy the sign bit)
Shift   Logical right shift      SHR k,dst     dst=dst>>k (vacated bits are filled with 0)

k is an immediate value or the register %cl only
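
The difference between the two right shifts only shows up for values whose sign bit is set. A minimal C sketch with hypothetical function names; it requires k below 64 and writes the arithmetic shift so that it never shifts a negative signed value directly.

```c
#include <assert.h>
#include <stdint.h>

/* SAL / SHL: zeros enter from the right. */
uint64_t shift_left(uint64_t dst, unsigned k) {
    assert(k < 64);
    return dst << k;
}

/* SHR: zeros enter from the left. */
uint64_t shift_right_logical(uint64_t dst, unsigned k) {
    assert(k < 64);
    return dst >> k;
}

/* SAR: copies of the sign bit enter from the left.
   For negative dst, ~dst is non-negative, so the inner shift is well defined. */
int64_t shift_right_arithmetic(int64_t dst, unsigned k) {
    assert(k < 64);
    if (dst >= 0)
        return dst >> k;
    return ~(~dst >> k);
}
```

For example, shifting -16 right by 2 gives -4 arithmetically, while the same bit pattern shifted logically becomes a large positive number.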

Arithmetic and Logical Operations
long scale(long x, long y, long z){
long t = x + 4 * y + 12 * z;
return t;
}
x in %rdi, y in %rsi, z in %rdx

scale:
leaq (%rdi,%rsi, 4), %rax -- %rax = 4*y+x
leaq (%rdx,%rdx, 2), %rdx -- %rdx = 2*z+z=3z
leaq (%rax,%rdx, 4), %rax -- %rax = 12z+4*y+x
ret
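
The three leaq steps can be replayed in C to confirm that they add up to x + 4*y + 12*z. The helper lea3 and the function scale_via_lea are hypothetical names used only in this sketch.

```c
#include <assert.h>

/* Models leaq (base,index,s),dst: dst = base + index*s, with no memory access. */
long lea3(long base, long index, long s) {
    assert(s == 1 || s == 2 || s == 4 || s == 8);
    return base + index * s;
}

/* Follows the assembly of scale one instruction at a time. */
long scale_via_lea(long x, long y, long z) {
    long rax = lea3(x, y, 4);     /* leaq (%rdi,%rsi,4),%rax : x + 4y        */
    long rdx = lea3(z, z, 2);     /* leaq (%rdx,%rdx,2),%rdx : 3z            */
    rax = lea3(rax, rdx, 4);      /* leaq (%rax,%rdx,4),%rax : x + 4y + 12z  */
    return rax;
}
```

The multiplication by 12 is split into 3 and 4 because the scale factor of an address can only be 1, 2, 4, or 8.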

Arithmetic and Logical Operations
Address Value Register Value
0x100 0xFF %rax 0x100
0x108 0xAB %rcx 0x1
0x110 0x13 %rdx 0x3
0x118 0x11

Instruction Destination Value


addq %rcx,(%rax) 0x100 0xFF + 1 = 0x100
subq %rdx,8(%rax) 0x108 0xAB - 3 = 0xA8
imulq $16,(%rax,%rdx,8) 0x118 0x11 * 0x10 = 0x110
incq 16(%rax) 0x110 0x13 + 1 = 0x14
decq %rcx %rcx 0x01 - 1 = 0x00
subq %rdx, %rax %rax 0x100 - 0x03 = 0xFD
Arithmetic and Logical Operations
long arith(long x, long y, long z){
long t1 = x^y; long t2 = z*48;
long t3 = t1&0x0F0F0F0F; long t4=t2-t3;
return t4;
}
x in %rdi, y in %rsi, z in %rdx

arith:
xorq %rsi,%rdi #%rdi=x^y(t1)
leaq (%rdx,%rdx, 2),%rax #%rax= z+2z= 3z
salq $4,%rax #%rax= 16*3z= 48z (t2)
andq $252645135, %rdi #%rdi = t1 & 0x0F0F0F0F(t3)
subq %rdi, %rax #%rax = t2 - t3 (t4)
ret #%rax = t4
Arithmetic and Logical Operations
long arith2(long x, long y, long z){
long t1 = ???; long t2 = ???;
long t3 = ???; long t4 = ???;
return t4;
}
x in %rdi, y in %rsi, z in %rdx

arith2:
orq %rsi,%rdi
sarq $3,%rdi
notq %rdi
movq %rdx, %rax
subq %rdi, %rax
ret
Arithmetic and Logical Operations
Type: Special - 128 bits - Oct word

Instruction   Effect                                     Description
imulq src     R[%rdx]:R[%rax] = src * R[%rax]            Signed full multiply
mulq src      R[%rdx]:R[%rax] = src * R[%rax]            Unsigned full multiply
cqto          R[%rdx]:R[%rax] = SignExtend(R[%rax])      Convert to oct word
idivq src     R[%rdx] = R[%rdx]:R[%rax] % src            Signed divide
              R[%rax] = R[%rdx]:R[%rax] / src
divq src      R[%rdx] = R[%rdx]:R[%rax] % src            Unsigned divide
              R[%rax] = R[%rdx]:R[%rax] / src
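
The register pair %rdx:%rax can be sketched in C with a 128-bit integer. Note that unsigned __int128 is a GCC/Clang extension rather than standard C, and the struct and function names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint64_t rdx; uint64_t rax; } unsigned_pair;
typedef struct { int64_t rdx; int64_t rax; } signed_pair;

/* mulq src: full 128-bit product, high half in rdx and low half in rax. */
unsigned_pair full_mulq(uint64_t rax, uint64_t src) {
    unsigned __int128 product = (unsigned __int128)rax * src;
    unsigned_pair r;
    r.rdx = (uint64_t)(product >> 64);
    r.rax = (uint64_t)product;
    return r;
}

/* cqto followed by idivq src: the dividend is rax sign-extended to 128 bits,
   the quotient lands in rax and the remainder in rdx. */
signed_pair cqto_idivq(int64_t rax, int64_t src) {
    assert(src != 0);                            /* division by zero traps */
    assert(!(rax == INT64_MIN && src == -1));    /* quotient would overflow */
    signed_pair r;
    r.rax = rax / src;   /* quotient, truncated toward zero       */
    r.rdx = rax % src;   /* remainder, same sign as the dividend  */
    return r;
}
```

Because cqto only sign-extends %rax, the 128-bit dividend in the second function always fits in 64 bits, which is why plain 64-bit division is enough to model it.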

Arithmetic and Logical Operations
Practice: Write C code for decode that will have an
effect equivalent to the given assembly code
long decode(long x, long y, long z);
x in %rdi, y in %rsi, z in %rdx

decode:
subq %rdx,%rsi
movq %rsi,%rax
imulq %rdi
salq $63, %rax
sarq $63, %rax
xorq %rdi, %rax
ret
Summary

✦ Assembly code - C code - machine code


✦ Registers and operands
✦ Movement instructions
✦ Arithmetic and logical instructions
✦ Next: Control instructions (control)

Practice
Write the definition of a function wsum in assembly

int wsum(int x, int y, int z)

The function returns the value of the expression 3x+2y+z

Test it in a C program by calling it from the main function
with the arguments x=20, y=25, and z=18.
