Assembly: Arithmetic and Logic: Machine Programming

Computer Organization

Uploaded by

Arfan Ghani

CSCI232

Assembly: Arithmetic and Logic


Machine Programming
Reading: B&O 3.5-3.6

How does a computer
interpret and execute
C programs?
Learning Assembly
• Moving data around (Week 4)
• Arithmetic and logical operations (This Lecture)
• Control flow
• Function calls


Learning Goals
• Learn how to perform arithmetic and logical operations in assembly
• Begin to learn how to read assembly and understand the C code that
generated it
Lecture Plan
• Recap: mov so far
• Data and Register Sizes
• The lea Instruction
• Logical and Arithmetic Operations
• Practice: Reverse Engineering
mov
The mov instruction copies bytes from one place to another;
it is similar to the assignment operator (=) in C.
mov src,dst

The src and dst can each be one of:
• Immediate (constant value, like a number) (only src)
• Register
• Memory Location (at most one of src, dst)
Memory Location Syntax
Syntax              Meaning
0x104               Address 0x104 (no $)
(%rax)              What’s in %rax
4(%rax)             What’s in %rax, plus 4
(%rax, %rdx)        Sum of what’s in %rax and %rdx
4(%rax, %rdx)       Sum of what’s in %rax and %rdx, plus 4
(, %rcx, 4)         What’s in %rcx, times 4 (multiplier can be 1, 2, 4, 8)
(%rax, %rcx, 2)     What’s in %rax, plus 2 times what’s in %rcx
8(%rax, %rcx, 2)    What’s in %rax, plus 2 times what’s in %rcx, plus 8
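Every row of the table is a special case of one formula: displacement + base + index × scale. The C sketch below models that calculation. The function name effective_address and its parameters are illustrative assumptions, not part of any real API; a missing part is passed as 0 (or 1 for the scale).

```c
#include <stdint.h>

/* Models the address named by the operand form imm(base, index, scale).
 * Hypothetical helper for illustration only.
 * scale is expected to be 1, 2, 4, or 8, as in the table. */
uint64_t effective_address(int64_t imm, uint64_t base,
                           uint64_t index, unsigned scale)
{
    return (uint64_t)imm + base + index * scale;
}
```

For example, with %rax holding 0x100 and %rcx holding 3, the operand 8(%rax, %rcx, 2) names address 0x100 + 2 * 3 + 8 = 0x10e, which is what effective_address(8, 0x100, 3, 2) computes.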
Data Sizes
Data sizes in assembly have slightly different terminology to get used to:
• A byte is 1 byte.
• A word is 2 bytes.
• A double word is 4 bytes.
• A quad word is 8 bytes.

Assembly instructions can have suffixes to refer to these sizes:
• b means byte
• w means word
• l means double word
• q means quad word
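As a rough cross-check against C, the fixed-width integer types line up with the four suffixes. The sketch below is only an illustration; suffix_size is a hypothetical helper name, not something the lecture defines.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the operand size in bytes implied by an instruction suffix,
 * or 0 for an unrecognized suffix. Hypothetical helper for illustration. */
size_t suffix_size(char suffix)
{
    switch (suffix) {
    case 'b': return sizeof(int8_t);   /* byte        */
    case 'w': return sizeof(int16_t);  /* word        */
    case 'l': return sizeof(int32_t);  /* double word */
    case 'q': return sizeof(int64_t);  /* quad word   */
    default:  return 0;
    }
}
```

On x86-64 Linux, char, short, int, and long (and pointers) typically have these four sizes, which is why compiled C code for an int usually carries the l suffix.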
Register Sizes
Bits 63-0   Bits 31-0   Bits 15-0   Bits 7-0
%rax        %eax        %ax         %al
%rbx        %ebx        %bx         %bl
%rcx        %ecx        %cx         %cl
%rdx        %edx        %dx         %dl
%rsi        %esi        %si         %sil
%rdi        %edi        %di         %dil
%rbp        %ebp        %bp         %bpl
%rsp        %esp        %sp         %spl
%r8         %r8d        %r8w        %r8b
%r9         %r9d        %r9w        %r9b
%r10        %r10d       %r10w       %r10b
%r11        %r11d       %r11w       %r11b
%r12        %r12d       %r12w       %r12b
%r13        %r13d       %r13w       %r13b
%r14        %r14d       %r14w       %r14b
%r15        %r15d       %r15w       %r15b
Register Responsibilities
Some registers take on special responsibilities during program execution.
• %rax stores the return value
• %rdi stores the first parameter to a function
• %rsi stores the second parameter to a function
• %rdx stores the third parameter to a function
• %rip stores the address of the next instruction to execute
• %rsp stores the address of the current top of the stack
mov Variants
• mov can take an optional suffix (b,w,l,q) that specifies the size of data to move:
movb, movw, movl, movq
• mov only updates the specific register bytes or memory locations indicated.
• Exception: movl writing to a register also sets the register’s high-order 4 bytes to 0.
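The difference between the variants is easiest to see by tracking a full 64-bit register. The following C sketch models a register-destination mov of each width; the model_* names are illustrative assumptions, and unsigned arithmetic is used so the bit patterns are well defined.

```c
#include <stdint.h>

/* Each function returns the new contents of a 64-bit register `reg`
 * after a mov of the given width writes `src` into it.
 * Hypothetical helpers for illustration only. */

/* movb: only the low byte changes */
uint64_t model_movb(uint64_t reg, uint8_t src)
{
    return (reg & ~UINT64_C(0xff)) | src;
}

/* movw: only the low 2 bytes change */
uint64_t model_movw(uint64_t reg, uint16_t src)
{
    return (reg & ~UINT64_C(0xffff)) | src;
}

/* movl: low 4 bytes are written, high 4 bytes are set to 0 */
uint64_t model_movl(uint64_t reg, uint32_t src)
{
    (void)reg;              /* old contents do not survive */
    return (uint64_t)src;
}
```

Starting from 0x1122334455667788, a movb of 0xff gives 0x11223344556677ff, while a movl of 0xdeadbeef gives 0x00000000deadbeef.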
Practice: mov And Data Sizes
For each of the following mov instructions, determine the appropriate suffix
based on the operands (e.g. movb, movw, movl or movq).

1. mov__ %eax, (%rsp)
2. mov__ (%rax), %dx
3. mov__ $0xff, %bl
4. mov__ (%rsp,%rdx,4),%dl
5. mov__ (%rdx), %rax
6. mov__ %dx, (%rax)
Practice: mov And Data Sizes
Answers (the suffix matches the size of the register operand):

1. movl %eax, (%rsp)          (%eax is 4 bytes)
2. movw (%rax), %dx           (%dx is 2 bytes)
3. movb $0xff, %bl            (%bl is 1 byte)
4. movb (%rsp,%rdx,4),%dl     (%dl is 1 byte)
5. movq (%rdx), %rax          (%rax is 8 bytes)
6. movw %dx, (%rax)           (%dx is 2 bytes)
mov
• The movabsq instruction is used to write a 64-bit immediate (constant) value.
• The regular movq instruction can only take 32-bit immediates, which are sign-extended to 64 bits.
• movabsq takes a 64-bit immediate as its source and only a register as its destination.

movabsq $0x0011223344556677, %rax


movz and movs
• There are two mov instructions that can be used to copy a smaller source to a
larger destination: movz and movs.
• movz fills the remaining bytes with zeros
• movs fills the remaining bytes by sign-extending the most significant bit in the
source.
• The source must be from memory or a register, and the destination is a
register.
movz and movs

MOVZ S,R R ← ZeroExtend(S)

Instruction Description
movzbw Move zero-extended byte to word
movzbl Move zero-extended byte to double word
movzwl Move zero-extended word to double word
movzbq Move zero-extended byte to quad word
movzwq Move zero-extended word to quad word
movz and movs

MOVS S,R R ← SignExtend(S)

Instruction Description
movsbw Move sign-extended byte to word
movsbl Move sign-extended byte to double word
movswl Move sign-extended word to double word
movsbq Move sign-extended byte to quad word
movswq Move sign-extended word to quad word
movslq Move sign-extended double word to quad word
cltq Sign-extend %eax to %rax (%rax ← SignExtend(%eax))
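In C, the same two behaviors show up as conversions from a narrower unsigned or signed type to a wider type. The sketch below is an illustration under that analogy; the function names are assumptions, and the comments name the instruction a compiler would plausibly use.

```c
#include <stdint.h>

/* Unsigned source: upper bytes are filled with zeros (like movzbl). */
uint32_t zero_extend_byte(uint8_t b)
{
    return (uint32_t)b;
}

/* Signed source: upper bytes copy the sign bit (like movsbl). */
int32_t sign_extend_byte(int8_t b)
{
    return (int32_t)b;
}

/* 4-byte signed to 8-byte signed (like movslq, or cltq on %eax). */
int64_t sign_extend_long(int32_t x)
{
    return (int64_t)x;
}
```

The byte 0xff becomes 0x000000ff under zero extension, but the signed byte -1 (the same bit pattern) becomes 0xffffffff under sign extension, so the numeric value -1 is preserved.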
lea
The lea instruction computes an “effective address” from its source operand and stores that address in the destination.
lea src,dst

Unlike mov, which copies the data at the address src to the destination, lea copies
the address that src computes to the destination, without accessing memory.

The operand syntax is the same as for mov; the difference is how lea handles the src.
The dst of lea must be a register.
lea vs. mov
Operands                  Instruction   Interpretation
6(%rax), %rdx             mov           Go to the address (6 + what’s in %rax) and copy the data there into %rdx.
                          lea           Copy 6 + what’s in %rax into %rdx.
(%rax, %rcx), %rdx        mov           Go to the address (what’s in %rax + what’s in %rcx) and copy the data there into %rdx.
                          lea           Copy (what’s in %rax + what’s in %rcx) into %rdx.
(%rax, %rcx, 4), %rdx     mov           Go to the address (%rax + 4 * %rcx) and copy the data there into %rdx.
                          lea           Copy (%rax + 4 * %rcx) into %rdx.
7(%rax, %rax, 8), %rdx    mov           Go to the address (7 + %rax + 8 * %rax) and copy the data there into %rdx.
                          lea           Copy (7 + %rax + 8 * %rax) into %rdx.
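A C analogy may help: mov with a memory source behaves like reading arr[i], while lea behaves like taking &arr[i]. The sketch below is an illustration only; the function names are hypothetical, and the assembly in the comments is what such code could plausibly compile to, not a guaranteed output.

```c
#include <stdint.h>

/* Like mov (%rax,%rcx,4), ... : go to base + 4*i and read what is there. */
int32_t like_mov(const int32_t *base, long i)
{
    return base[i];
}

/* Like lea (%rax,%rcx,4), %rdx : compute base + 4*i, read nothing. */
const int32_t *like_lea(const int32_t *base, long i)
{
    return &base[i];
}

/* lea is also used for plain arithmetic:
 * lea 7(%rax,%rax,8), %rdx computes 7 + x + 8*x, i.e. 9*x + 7. */
long lea_arith(long x)
{
    return 7 + x + 8 * x;
}
```

Because lea never touches memory, compilers often use it as a cheap way to combine an add, a small multiply, and a constant in one instruction, as in lea_arith.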
Unary Instructions
The following instructions operate on a single operand (register or memory):
Instruction   Effect       Description
inc D         D ← D + 1    Increment
dec D         D ← D - 1    Decrement
neg D         D ← -D       Negate
not D         D ← ~D       Complement

Examples:
incq 16(%rax)
dec %rdx
not %rcx

Note: Whenever a register is referenced in parentheses, e.g. (%rax), the register's value is treated
as a memory address, and the instruction operates on the value stored at that address (also called
dereferencing).
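The four unary effects can be mirrored in C by functions that update a location in place. This is a minimal sketch with assumed names; the pointer parameter stands in for the operand D, and unsigned arithmetic is used so that wraparound matches the hardware's two's complement behavior.

```c
#include <stdint.h>

/* Each helper updates the 8-byte location *d in place.
 * Hypothetical names for illustration only. */
void model_inc(uint64_t *d) { *d = *d + 1; }   /* inc D: D <- D + 1 */
void model_dec(uint64_t *d) { *d = *d - 1; }   /* dec D: D <- D - 1 */
void model_neg(uint64_t *d) { *d = 0 - *d; }   /* neg D: D <- -D    */
void model_not(uint64_t *d) { *d = ~*d; }      /* not D: D <- ~D    */
```

An instruction such as incq 16(%rax) corresponds to calling model_inc on the 8-byte location that sits 16 bytes past the address held in %rax.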
Binary Instructions
The following instructions operate on two operands. Each operand can be a register or a
memory location, and the source can also be an immediate, but the two operands cannot both be
memory locations.
Read each one as, e.g., “Subtract S from D”:
Instruction   Effect       Description
add S, D      D ← D + S    Add
sub S, D      D ← D - S    Subtract
imul S, D     D ← D * S    Multiply
xor S, D      D ← D ^ S    Exclusive-or
or S, D       D ← D | S    Or
and S, D      D ← D & S    And

Examples:
addq %rcx,(%rax)
xorq $16,(%rax, %rdx, 8)
subq %rdx,8(%rax)
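Since the destination is both an input and the output, each binary instruction acts like a C compound assignment (+=, -=, and so on). The sketch below models the table under that reading; the binop enum and apply_binop are assumed names used only for illustration.

```c
#include <stdint.h>

typedef enum { OP_ADD, OP_SUB, OP_IMUL, OP_XOR, OP_OR, OP_AND } binop;

/* Models `op S, D`: reads s, then updates the location *d in place.
 * Hypothetical helper; unsigned math gives well-defined wraparound. */
void apply_binop(binop op, uint64_t s, uint64_t *d)
{
    switch (op) {
    case OP_ADD:  *d += s; break;
    case OP_SUB:  *d -= s; break;   /* subtract S from D */
    case OP_IMUL: *d *= s; break;   /* low 64 bits of the product */
    case OP_XOR:  *d ^= s; break;
    case OP_OR:   *d |= s; break;
    case OP_AND:  *d &= s; break;
    }
}
```

For instance, subq %rdx,8(%rax) corresponds to apply_binop(OP_SUB, rdx, p), where p points 8 bytes past the address in %rax.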
Assembly Exercise 1
00000000004005ac <sum_example1>:
4005bd: 8b 45 e8 mov %esi,%eax
4005c3: 01 d0 add %edi,%eax
4005cc: c3 retq

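Which of the following is most likely to have generated the above assembly?
// A)
void sum_example1() {
    int x;
    int y;
    int sum = x + y;
}

// B)
int sum_example1(int x, int y) {
    return x + y;
}

// C)
void sum_example1(int x, int y) {
    int sum = x + y;
}

Answer: B. A and C do not return a value, whereas the assembly leaves x + y in %eax, the
return-value register. A also takes no parameters, yet the assembly reads %edi and %esi.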
Assembly Exercise 2
0000000000400578 <sum_example2>:
400578: 8b 47 0c mov 0xc(%rdi),%eax
40057b: 03 07 add (%rdi),%eax
40057d: 2b 47 18 sub 0x18(%rdi),%eax
400580: c3 retq

int sum_example2(int arr[]) {
    int sum = 0;
    sum += arr[0];
    sum += arr[3];
    sum -= arr[6];
    return sum;
}

What location or value in the assembly above represents the C code’s sum variable?
Answer: %eax
Disassembling Object Code

Disassembler
gcc -g -O -c example2.c
objdump -d example2.o
• Useful tool for examining object code
• Analyzes the bit pattern of a series of instructions
• Produces an approximate rendition of the assembly code
• Can be run on either a.out (a complete executable) or a .o file
Assembly Exercise 3
0000000000400578 <sum_example2>:
400578: 8b 47 0c mov 0xc(%rdi),%eax
40057b: 03 07 add (%rdi),%eax
40057d: 2b 47 18 sub 0x18(%rdi),%eax
400580: c3 retq

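int sum_example2(int arr[]) {
    int sum = 0;
    sum += arr[0];
    sum += arr[3];
    sum -= arr[6];
    return sum;
}

What location or value in the assembly code above represents the C code’s 6 (as in arr[6])?
Answer: 0x18. Each int is 4 bytes, so arr[6] sits 6 * 4 = 24 = 0x18 bytes past the start of arr,
whose address is in %rdi.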
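The offsets 0xc and 0x18 in the listing follow from the element size. The short C sketch below reproduces that arithmetic, assuming a 4-byte int as on x86-64; int_elem_offset is a hypothetical helper name.

```c
#include <stddef.h>

/* Byte offset of arr[index] from the start of an int array.
 * Assumes sizeof(int) == 4, as on x86-64. Hypothetical helper. */
size_t int_elem_offset(size_t index)
{
    return index * sizeof(int);
}
```

With 4-byte ints, index 0 maps to offset 0 (the bare (%rdi) operand), index 3 to 0xc, and index 6 to 0x18, matching the three memory operands in sum_example2.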
Recap
• Recap: mov so far
• Data and Register Sizes
• The lea Instruction
• Logical and Arithmetic Operations
• Practice: Reverse Engineering

Next Time: control flow in assembly (while loops, if statements, and more)

Lecture takeaway: There are assembly instructions for arithmetic and logical operations.
They share the same operand forms as mov, but lea interprets them differently: it computes
the address without dereferencing it. There are also different register sizes that may be
used in assembly instructions.
A Note About Operand Forms
• Many instructions share the same address operand forms that mov uses.
• E.g. 7(%rax, %rcx, 2).
• These forms work the same way for other instructions, e.g. sub:
• sub 8(%rax,%rdx),%rcx -> Go to 8 + %rax + %rdx, subtract what’s there from %rcx
• The exception is lea:
• It interprets this form as just the calculation, without the dereferencing
• lea 8(%rax,%rdx),%rcx -> Calculate 8 + %rax + %rdx, put it in %rcx
Reverse Engineering 1
int add_to(int x, int arr[], int i) {
int sum = ___?___;
sum += arr[___?___];
return ___?___;
}

----------

add_to:
movslq %edx, %rdx
movl %edi, %eax
addl (%rsi,%rdx,4), %eax
ret

Reverse Engineering 1
int add_to(int x, int arr[], int i) {
int sum = x;
sum += arr[i];
return sum;
}

----------
// x in %edi, arr in %rsi, i in %edx
add_to:
movslq %edx, %rdx // sign-extend i into full register
movl %edi, %eax // copy x into %eax
addl (%rsi,%rdx,4), %eax // add arr[i] to %eax
ret

Reverse Engineering 2
int elem_arithmetic(int nums[], int y) {
int z = nums[___?___] * ___?___;
z -= ___?___;
z >>= ___?___;
return ___?___;
}
----------

elem_arithmetic:
movl %esi, %eax
imull (%rdi), %eax
subl 4(%rdi), %eax
sarl $2, %eax
addl $2, %eax
ret
Reverse Engineering 2
int elem_arithmetic(int nums[], int y) {
int z = nums[0] * y;
z -= nums[1];
z >>= 2;
return z + 2;
}
----------
// nums in %rdi, y in %esi
elem_arithmetic:
movl %esi, %eax // copy y into %eax
imull (%rdi), %eax // multiply %eax by nums[0]
subl 4(%rdi), %eax // subtract nums[1] from %eax
sarl $2, %eax // arithmetic right shift of %eax by 2
addl $2, %eax // add 2 to %eax
ret
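One way to sanity-check a reverse-engineered answer is to trace the C on small inputs and compare each step with the matching instruction. The sketch below restates the two filled-in functions from the exercises with the corresponding assembly in comments; the sample inputs in the note after it are assumptions chosen only for illustration.

```c
/* x arrives in %edi, arr in %rsi, i in %edx */
int add_to(int x, int arr[], int i)
{
    int sum = x;          /* movl %edi, %eax */
    sum += arr[i];        /* movslq %edx, %rdx ; addl (%rsi,%rdx,4), %eax */
    return sum;           /* result is already in %eax */
}

/* nums arrives in %rdi, y in %esi */
int elem_arithmetic(int nums[], int y)
{
    int z = nums[0] * y;  /* movl %esi, %eax ; imull (%rdi), %eax */
    z -= nums[1];         /* subl 4(%rdi), %eax */
    z >>= 2;              /* sarl $2, %eax */
    return z + 2;         /* addl $2, %eax */
}
```

With nums = {6, 4} and y = 3, elem_arithmetic computes 6 * 3 = 18, then 18 - 4 = 14, then 14 >> 2 = 3, and returns 3 + 2 = 5. Note that >> on a negative int is implementation-defined in C; GCC on x86-64 uses an arithmetic shift, which is why sarl appears here.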