3.1 Machine Basics
Abstraction
▪ Language program (in C):
{
    int temp;
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
}
[Figure: layers of abstraction, with the instruction set as the interface between software and hardware.]
Programmer-Visible State
▪ PC: Program counter
  ▪ Address of next instruction
  ▪ Called %rip ("RIP") on x86-64
▪ Register file
  ▪ Heavily used program data, 16 x 64 bits
▪ Condition codes
  ▪ Store status information about most recent arithmetic or logical operation
  ▪ Used for conditional branching
▪ Vector registers
▪ Memory
  ▪ Byte addressable array
  ▪ Code and user data
  ▪ Stack to support procedures
  ▪ No distinguishing between different datatypes (int, pointers, arrays, etc.)
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Carnegie Mellon
Object Code
Code for sumstore
0x0400595:
0x53 0x48 0x89 0xd3 0xe8 0xf2 0xff 0xff 0xff 0x48 0x89 0x03 0x5b 0xc3
• Total of 14 bytes
• Each instruction 1, 3, or 5 bytes
• Starts at address 0x0400595
Assembler
▪ Translates .s into .o
▪ Binary encoding of each instruction
▪ Nearly-complete image of executable code
▪ Missing linkages between code in different files
Linker
▪ Resolves references between files
▪ Combines with static run-time libraries
  ▪ E.g., code for malloc, printf
▪ Some libraries are dynamically linked
  ▪ Linking occurs when program begins execution
Disassembler
objdump -d sum
▪ Useful tool for examining object code
▪ Analyzes bit pattern of series of instructions
▪ Produces approximate rendition of assembly code
▪ Can be run on either a.out (complete executable) or .o file
Alternate Disassembly
Object
0x0400595:
0x53 0x48 0x89 0xd3 0xe8 0xf2 0xff 0xff 0xff 0x48 0x89 0x03 0x5b 0xc3
Disassembled
Dump of assembler code for function sumstore:
0x0000000000400595 <+0>:  push %rbx
0x0000000000400596 <+1>:  mov %rdx,%rbx
0x0000000000400599 <+4>:  callq 0x400590 <plus>
0x000000000040059e <+9>:  mov %rax,(%rbx)
0x00000000004005a1 <+12>: pop %rbx
0x00000000004005a2 <+13>: retq
Within gdb Debugger
gdb sum
disassemble sumstore
▪ Disassemble procedure
x/14xb sumstore
▪ Examine the 14 bytes starting at sumstore
No symbols in "WINWORD.EXE".
Disassembly of section .text:
30001000 <.text>:
30001000: 55             push %ebp
30001001: 8b ec          mov %esp,%ebp
30001003: 6a ff          push $0xffffffff
30001005: 68 90 10 00 30 push $0x30001090
3000100a: 68 91 dc 4c 30 push $0x304cdc91
Reverse engineering forbidden by Microsoft End User License Agreement
Example
In most cases, the mov instructions update only the specific register bytes
or memory locations indicated by the destination operand. The one exception
is movl with a register as the destination: it also sets the high-order
4 bytes of the register to 0. This exception arises from the convention,
adopted in x86-64, that any instruction that generates a 32-bit value for a
register also sets the high-order portion of the register to 0.
The regular movq instruction can only have immediate source operands
that can be represented as 32-bit two’s-complement numbers. This value is then
sign extended to produce the 64-bit value for the destination.
The movabsq instruction can have an arbitrary 64-bit immediate value as its source
operand and can only have a register as a destination.
Practice Problem
Example
void swap(long *xp, long *yp)
{
    long t0 = *xp;
    long t1 = *yp;
    *xp = t1;
    *yp = t0;
}

swap:
    movq (%rdi), %rax
    movq (%rsi), %rdx
    movq %rdx, (%rdi)
    movq %rax, (%rsi)
    ret
Understanding Swap()
void swap(long *xp, long *yp)
{
    long t0 = *xp;
    long t1 = *yp;
    *xp = t1;
    *yp = t0;
}

Register  Value
%rdi      xp
%rsi      yp
%rax      t0
%rdx      t1

swap:
    movq (%rdi), %rax # t0 = *xp
    movq (%rsi), %rdx # t1 = *yp
    movq %rdx, (%rdi) # *xp = t1
    movq %rax, (%rsi) # *yp = t0
    ret
Understanding Swap() (initial state)
Registers
%rdi  0x120
%rsi  0x100
%rax
%rdx
Memory
Address  Value
0x120    123
0x118
0x110
0x108
0x100    456
swap:
movq (%rdi), %rax # t0 = *xp
movq (%rsi), %rdx # t1 = *yp
movq %rdx, (%rdi) # *xp = t1
movq %rax, (%rsi) # *yp = t0
ret
Understanding Swap() (after movq (%rdi), %rax)
Registers
%rdi  0x120
%rsi  0x100
%rax  123
%rdx
Memory
Address  Value
0x120    123
0x118
0x110
0x108
0x100    456
swap:
movq (%rdi), %rax # t0 = *xp
movq (%rsi), %rdx # t1 = *yp
movq %rdx, (%rdi) # *xp = t1
movq %rax, (%rsi) # *yp = t0
ret
Understanding Swap() (after movq (%rsi), %rdx)
Registers
%rdi  0x120
%rsi  0x100
%rax  123
%rdx  456
Memory
Address  Value
0x120    123
0x118
0x110
0x108
0x100    456
swap:
movq (%rdi), %rax # t0 = *xp
movq (%rsi), %rdx # t1 = *yp
movq %rdx, (%rdi) # *xp = t1
movq %rax, (%rsi) # *yp = t0
ret
Understanding Swap() (after movq %rdx, (%rdi))
Registers
%rdi  0x120
%rsi  0x100
%rax  123
%rdx  456
Memory
Address  Value
0x120    456
0x118
0x110
0x108
0x100    456
swap:
movq (%rdi), %rax # t0 = *xp
movq (%rsi), %rdx # t1 = *yp
movq %rdx, (%rdi) # *xp = t1
movq %rax, (%rsi) # *yp = t0
ret
Understanding Swap() (after movq %rax, (%rsi))
Registers
%rdi  0x120
%rsi  0x100
%rax  123
%rdx  456
Memory
Address  Value
0x120    456
0x118
0x110
0x108
0x100    123
swap:
movq (%rdi), %rax # t0 = *xp
movq (%rsi), %rdx # t1 = *yp
movq %rdx, (%rdi) # *xp = t1
movq %rax, (%rsi) # *yp = t0
ret
Shift
Both arithmetic and logical right shifts are
possible. The different shift instructions can specify the shift amount either as
an immediate value or with the single-byte register %cl. (These instructions are
unusual in only allowing this specific register as the operand.)
The imulq instruction has two different forms. One form serves as a “two-operand”
multiply instruction, generating a 64-bit product from two 64-bit operands. The
other version is given below: