07 Basicx86Architecture 1up

The document gives a brief history of the x86 architecture and Intel processors. It describes how the x86 architecture started with the 8086 processor in 1978 and grew more powerful through the 80386, Pentium, and 64-bit processors. It also discusses how AMD competed with Intel by tracking its advances and by developing x86-64, the 64-bit extension of x86.

Uploaded by

Anindra Nallapati

7: Basic x86 architecture

Computer Architecture and Systems Programming

252-0061-00, Autumn Semester 2013
Timothy Roscoe

1
7.1: What is an instruction set
architecture?

2
Definitions
• Architecture: (also instruction set architecture: ISA) The
parts of a processor design that one needs to
understand to write assembly code.
Examples:
– instruction set specification, registers.

• Microarchitecture: Implementation of the architecture.

• Examples:
– cache sizes and core frequency.

• Example ISAs: x86, MIPS, ia64, VAX, Alpha, ARM, etc.

3
Instruction Set Architecture
• Assembly Language View
– Processor state
• Registers, memory, …
– Instructions
• addl, movl, leal, …
• How instructions are encoded as bytes
• Layer of Abstraction
– Above: how to program machine
• Processor executes instructions in a sequence
– Below: what needs to be built
• Use variety of tricks to make it run fast
• E.g., execute multiple instructions simultaneously

Layers (top to bottom): Application Program | Compiler, OS | ISA | CPU Design | Circuit Design | Chip Layout

4
CISC Instruction Sets
– Complex Instruction Set Computer
– Dominant style through mid-80’s
• Stack-oriented instruction set
– Use stack to pass arguments, save program counter
– Explicit push and pop instructions
• Arithmetic instructions can access memory
– addl %eax, 12(%ebx,%ecx,4)
• requires memory read and write
• Complex address calculation
• Condition codes
– Set as side effect of arithmetic and logical instructions
• Philosophy
– Add instructions to perform “typical” programming tasks
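
As a rough illustration of the addl %eax, 12(%ebx,%ecx,4) example above, the C sketch below models the work packed into that single instruction: an address calculation, a memory read, an add, and a memory write. The function name addl_to_memory and the flat mem byte array are assumptions made for this sketch; they are not part of the slides.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of: addl %eax, 12(%ebx,%ecx,4)
 * mem is a flat byte array standing in for memory; ebx and ecx are
 * treated as byte offsets into it (an assumption of this sketch). */
void addl_to_memory(uint8_t *mem, uint32_t eax, uint32_t ebx, uint32_t ecx)
{
    uint32_t addr = ebx + 4 * ecx + 12;         /* complex address calculation */
    uint32_t value;

    memcpy(&value, mem + addr, sizeof value);   /* memory read  */
    value += eax;                               /* arithmetic   */
    memcpy(mem + addr, &value, sizeof value);   /* memory write */
}
```

A load/store architecture would need at least three separate instructions for the same effect (load, add, store).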

5
RISC Instruction Sets
– Reduced Instruction Set Computer
– Internal project at IBM, later popularized by Hennessy (Stanford)
and Patterson (Berkeley)
• Fewer, simpler instructions
– Might take more to get given task done
– Can execute them with small and fast hardware
• Register-oriented instruction set
– Many more (typically 32) registers
– Use for arguments, return pointer, temporaries
• Only load and store instructions can access memory
– Similar to Y86 mrmovl and rmmovl – see later!
• No Condition codes
– Test instructions return 0/1 in register

6
Contrast with x86 / 64-bit
• Operations are highly uniform
– All encoded in exactly 32 bits
– All take the same time to execute (mostly)
– All operate between registers, or only load/store
– All operate on 64 or 32 bit quantities (nothing
smaller)
• No condition codes: use registers
• Lots of registers, including zero
– All registers are uniform

7
Other RISC features
(not in Alpha)
• Explicit delay slots (e.g. MIPS)
– E.g. can’t use a value until 2 instructions after the load
• Make most instructions conditional (e.g. ARM)
– Needs condition codes (why?)
– Reduces branches, increases code density
• Etc.

• Key message: x86 is not the only way to do this!

8
CISC vs. RISC
• Original Debate
– Strong opinions!
– CISC proponents---easy for compiler, fewer code bytes
– RISC proponents---better for optimizing compilers, can
make run fast with simple chip design
• Current Status
– For desktop processors, choice of ISA not a technical issue
• With enough hardware, can make anything run fast
• Code compatibility more important
– For embedded processors, RISC still makes sense
• Smaller, cheaper, less power
• For how much longer?

9
Comparison with MIPS
(remember Digital Design?)

• MIPS is RISC: Reduced Instruction Set

– Motivation: simpler is faster
• Fewer gates ⇒ higher frequency
• Fewer gates ⇒ more transistors left for cache
– Seemed like a really good idea
• x86 is CISC: Complex Instruction Set
– More complex instructions, addressing modes
• Intel turned out to be way too good at manufacturing
• Difference in gate count became too small to make a
difference
• x86 inside is mostly RISC anyway, decode logic is small
– ⇒ Argument is mostly irrelevant these days
10
There are many architectures…
• You’ve already seen MIPS 2000 → MIPS 3000 → …
– Workstations, minicomputers, now mostly embedded networking
• IBM S/360 → S/370 → … → zSeries
– First to separate architecture from (many) implementations
• ARM (several variants)
– Very common in embedded systems, basis for Advanced OS course at ETHZ
• IBM POWER → PowerPC (→ Cell, sort of)
– Basis for all 3 last-gen games console systems
• DEC Alpha
– Personal favorite; killed by Compaq, team left for Intel to work on…
• Intel Itanium
– First 64-bit Intel product; very fast (esp. FP), hot, and expensive
– Mostly overtaken by 64-bit x86 designs
• etc.

11
Summary
• Architecture vs. Microarchitecture
• Instruction set architectures
• RISC vs. CISC
• x86: comparison with MIPS

12
7.2: A bit of x86 history

13
Intel x86 Processors
• The x86 Architecture dominates the computer market

• Evolutionary design
– Backwards compatible all the way back to the 8086, introduced in 1978
– Added more features as time goes on

• Complex instruction set computer (CISC)

– Many different instructions with many different formats
• But, only small subset encountered with Linux programs
– Hard to match performance of Reduced Instruction Set
Computers (RISC)
– But, Intel has done just that!

14
Intel x86 Evolution: Milestones
Name Date Transistors MHz
• 8086 1978 29K 5-10
– Intel's first 16-bit processor. Basis (via its 8088 variant) for IBM PC & DOS
– 1MB address space
• 80386 1985 275K 16-33
– First 32-bit x86 processor, referred to as IA32
– Added “flat addressing”
– Capable of running Unix
– 32-bit Linux/gcc uses no instructions introduced in later models
• Pentium 4F 2005 230M 2800-3800
– First 64-bit [x86] processor
– Meanwhile, Pentium 4s (Netburst arch.) phased out in favor of
“Core” line

15
Intel x86 Processors: Overview

Architectures        Processors (in order of time)
X86-16               8086, 286
X86-32 / IA32        386, 486, Pentium
MMX                  Pentium MMX
SSE                  Pentium III
SSE2                 Pentium 4
SSE3                 Pentium 4E
X86-64 / EM64t       Pentium 4F, Core 2 Duo
SSE4                 Core i7

IA: often redefined as latest Intel architecture
16
Intel x86 Processors, contd.
• Machine Evolution
Name          Date   Transistors
486           1989   1.9M
Pentium       1993   3.1M
Pentium/MMX   1997   4.5M
PentiumPro    1995   6.5M
Pentium III   1999   8.2M
Pentium 4     2001   42M
Core 2 Duo    2006   291M
• Added Features
– Instructions to support multimedia operations
• Parallel operations on 1, 2, and 4-byte data, both integer & FP
– Instructions to enable more efficient conditional operations

17
x86 Clones: Advanced Micro
Devices (AMD)
• Historically
– AMD has followed just behind Intel
– A little bit slower, a lot cheaper
• Then
– Recruited top circuit designers from Digital Equipment
Corp. and other downward trending companies
– Built Opteron: tough competitor to Pentium 4
– Developed x86-64, their own extension to 64 bits
• Recently
– Intel much quicker with dual core design
– Intel currently far ahead in performance
– em64t backwards compatible to x86-64

18
Intel’s 64-Bit
(partially true…)
• Intel Attempted Radical Shift from IA32 to IA64
– Totally different architecture (Itanium)
– Executes IA32 code only as legacy
– Performance disappointing
• AMD Stepped in with Evolutionary Solution
– x86-64 (now called “AMD64”)
• Intel Felt Obligated to Focus on IA64
– Hard to admit mistake or that AMD is better
• 2004: Intel Announces EM64T extension to IA32
– Extended Memory 64-bit Technology
– Almost identical to x86-64!

19
Intel Nehalem-EX
• Current leader
(for the next few weeks)
– 2.3 billion transistors/die
– 8 or 10 cores per die
– 2 threads per core
– Up to 8 packages
(= 128 contexts!)
– 4 memory channels per package
– Virtualization support
– etc.
• Good illustration of why it is
hard to teach state-of-the-art
processor design!

20
Intel Single-Chip Cloud
Computer - 2010
• Experimental processor
(only a few 100 made)
– Designed for research
– Working version in our Lab
• 48 old-style Pentium cores
• Very fast interconnection
network
– Hardware support for
messaging between cores
– Variable speed of network
• Non-cache coherent
– Sharing memory between
cores won’t work with a
conventional OS!

21
A quick note on syntax
There are two common ways to write x86
Assembler:

• AT&T syntax
– What we'll use in this course, common on Unix
• Intel syntax
– Generally used for Windows machines

22
7.3: Basics of machine code

23
Assembly programmer’s view
CPU: PC, Registers, Condition Codes
Memory: Object Code, Program Data, OS Data, Stack
CPU → Memory: Addresses
CPU ↔ Memory: Data
Memory → CPU: Instructions

Programmer-Visible State
– PC: Program counter
• Address of next instruction
• Called “EIP” (IA32) or “RIP” (x86-64)
– Register file
• Heavily used program data
– Condition codes
• Store status information about most recent arithmetic operation
• Used for conditional branching
– Memory
• Byte addressable array
• Code, user data, (some) OS data
• Includes stack used to support procedures
24
Compiling into assembly

C code

int sum(int x, int y)
{
    int t = x+y;
    return t;
}

Generated ia32 assembly

sum:
    pushl %ebp
    movl %esp,%ebp
    movl 12(%ebp),%eax
    addl 8(%ebp),%eax
    movl %ebp,%esp
    popl %ebp
    ret

Some compilers use the single instruction “leave” in place of the movl %ebp,%esp / popl %ebp pair.

Obtain with command

gcc -O -S code.c

Produces file code.s

25
Assembly data types
• “Integer” data of 1, 2, or 4 bytes
– Data values
– Addresses (untyped pointers)

• Floating point data of 4, 8, or 10 bytes

• No aggregate types such as arrays or

structures
– Just contiguously allocated bytes in memory
26
Assembly code operations
• Perform arithmetic function on register or
memory data

• Transfer data between memory and register

– Load data from memory into register
– Store register data into memory

• Transfer control
– Unconditional jumps to/from procedures
– Conditional branches

27
Object code
Code for sum

0x401040 <sum>:
0x55
0x89
0xe5
0x8b
0x45
0x0c
0x03
0x45
0x08
0x89
0xec
0x5d
0xc3

• Total of 13 bytes
• Each instruction 1, 2, or 3 bytes
• Starts at address 0x401040

• Assembler
– Translates .s into .o
– Binary encoding of each instruction
– Nearly-complete image of executable code
– Missing linkages between code in different files
• Linker
– Resolves references between files
– Combines with static run-time libraries
• E.g., code for malloc, printf
– Some libraries are dynamically linked
• Linking occurs when program begins execution
28
Machine instruction example
• C Code
int t = x+y;
– Add two signed integers
• Assembly
addl 8(%ebp),%eax
– Add 2 4-byte integers
• “Long” words in GCC parlance
• Same instruction whether signed or unsigned
– Operands:
• x: Memory M[%ebp+8]
• y: Register %eax (loaded by the preceding movl 12(%ebp),%eax)
• t: Register %eax
– Return function value in %eax
– Similar to expression: y += x
– More precisely:
int eax;
int *ebp;
eax += ebp[2]
• Object Code
0x401046: 03 45 08
– 3-byte instruction
– Stored at address 0x401046

29
Disassembling object code
Disassembled
00401040 <_sum>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 8b 45 0c mov 0xc(%ebp),%eax
6: 03 45 08 add 0x8(%ebp),%eax
9: 89 ec mov %ebp,%esp
b: 5d pop %ebp
c: c3 ret
d: 8d 76 00 lea 0x0(%esi),%esi

• Disassembler
– objdump -d p
– Useful tool for examining object code
– Analyzes bit pattern of series of instructions
– Produces approximate rendition of assembly code
– Can be run on either a.out (complete executable) or .o file
30
Alternate disassembly
Object

0x401040:
0x55
0x89
0xe5
0x8b
0x45
0x0c
0x03
0x45
0x08
0x89
0xec
0x5d
0xc3

Disassembled

0x401040 <sum>: push %ebp
0x401041 <sum+1>: mov %esp,%ebp
0x401043 <sum+3>: mov 0xc(%ebp),%eax
0x401046 <sum+6>: add 0x8(%ebp),%eax
0x401049 <sum+9>: mov %ebp,%esp
0x40104b <sum+11>: pop %ebp
0x40104c <sum+12>: ret
0x40104d <sum+13>: lea 0x0(%esi),%esi

Within gdb Debugger
– gdb p
– disassemble sum
• Disassemble procedure
– x/13b sum
• Examine the 13 bytes starting at sum
31
What can be disassembled?
% objdump -d WINWORD.EXE

WINWORD.EXE: file format pei-i386

No symbols in "WINWORD.EXE".
Disassembly of section .text:

30001000 <.text>:
30001000: 55 push %ebp
30001001: 8b ec mov %esp,%ebp
30001003: 6a ff push $0xffffffff
30001005: 68 90 10 00 30 push $0x30001090
3000100a: 68 91 dc 4c 30 push $0x304cdc91

• Anything that can be interpreted as executable code

• Disassembler examines bytes and reconstructs assembly source
32
Summary
• Compiling into assembly
• Data types in assembly
• Assembly code operations
• Object code, and disassembling it

33
7.4: 32-bit x86 architecture

34
Integer registers (ia32)
32-bit  16-bit  8-bit high  8-bit low  Origin (mostly obsolete)
%eax    %ax     %ah         %al        accumulate
%ecx    %cx     %ch         %cl        counter
%edx    %dx     %dh         %dl        data
%ebx    %bx     %bh         %bl        base
%esi    %si                            source index
%edi    %di                            destination index
%esp    %sp                            stack pointer
%ebp    %bp                            base pointer

General purpose: %eax, %ecx, %edx, %ebx, %esi, %edi
16-bit virtual registers (%ax … %bp): backwards compatibility
35
Moving data: ia32
• movx Source, Dest
– x in {b, w, l}
– movl Source, Dest: Move 4-byte “long word”
– movw Source, Dest: Move 2-byte “word”
– movb Source, Dest: Move 1-byte “byte”

• Lots of these in typical code

36
Moving data: ia32
movl Source, Dest:
• Operand Types
– Immediate: Constant integer data
• Example: $0x400, $-533
• Like C constant, but prefixed with ‘$’
• Encoded with 1, 2, or 4 bytes
– Register: One of 8 integer registers
• Example: %eax, %edx
• But %esp and %ebp reserved for special use
• Others have special uses for particular instructions
– Memory: 4 consecutive bytes of memory at address given by register
• Simplest example: (%eax)
• Various other “address modes”

37
movl operand combinations
Source  Dest  Src,Dest            C Analog
Imm     Reg   movl $0x4,%eax      temp = 0x4;
Imm     Mem   movl $-147,(%eax)   *p = -147;
Reg     Reg   movl %eax,%edx      temp2 = temp1;
Reg     Mem   movl %eax,(%edx)    *p = temp;
Mem     Reg   movl (%eax),%edx    temp = *p;

Cannot do memory-memory transfer with a single instruction
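
Because no single movl takes two memory operands, a memory-to-memory copy has to pass through a register. The C sketch below mirrors that two-step sequence. The name copy_word and the local tmp (standing in for a register such as %eax) are chosen purely for illustration, and the register choices in the comments are just one possibility.

```c
/* Sketch of a memory-to-memory copy as two moves through a register.
 * tmp plays the role of a register such as %eax (illustrative only). */
void copy_word(const int *src, int *dst)
{
    int tmp = *src;   /* e.g. movl (%ecx),%eax : Mem -> Reg */
    *dst = tmp;       /* e.g. movl %eax,(%edx) : Reg -> Mem */
}
```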

38
Simple memory
addressing modes
• Normal (R) Mem[Reg[R]]
– Register R specifies memory address

movl (%ecx),%eax

• Displacement D(R) Mem[Reg[R]+D]

– Register R specifies start of memory region
– Constant displacement D specifies offset

movl 8(%ebp),%edx

39
Using simple addressing modes
void swap(int *xp, int *yp)
{
    int t0 = *xp;
    int t1 = *yp;
    *xp = t1;
    *yp = t0;
}

swap:
    # Set Up
    pushl %ebp
    movl %esp,%ebp
    pushl %ebx

    # Body
    movl 12(%ebp),%ecx
    movl 8(%ebp),%edx
    movl (%ecx),%eax
    movl (%edx),%ebx
    movl %eax,(%edx)
    movl %ebx,(%ecx)

    # Finish
    movl -4(%ebp),%ebx
    movl %ebp,%esp
    popl %ebp
    ret
40
Understanding swap
void swap(int *xp, int *yp)
{
    int t0 = *xp;
    int t1 = *yp;
    *xp = t1;
    *yp = t0;
}

Stack (in memory)
Offset  Contents
  12    yp
   8    xp
   4    Rtn adr
   0    Old %ebp   ← %ebp
  -4    Old %ebx

Register  Value
%ecx      yp
%edx      xp
%eax      t1
%ebx      t0

movl 12(%ebp),%ecx # ecx = yp
movl 8(%ebp),%edx # edx = xp
movl (%ecx),%eax # eax = *yp (t1)
movl (%edx),%ebx # ebx = *xp (t0)
movl %eax,(%edx) # *xp = eax
movl %ebx,(%ecx) # *yp = ebx
42
Understanding swap
Register file
%eax  456
%edx  0x124
%ecx  0x120
%ebx  123
%esi
%edi
%esp
%ebp  0x104

Memory
Address  Contents  Offset from %ebp
0x124    123
0x120    456
0x11c
0x118
0x114
0x110    0x120     12 (yp)
0x10c    0x124      8 (xp)
0x108    Rtn adr    4
0x104               0 (%ebp →)
0x100              -4

movl 12(%ebp),%ecx # ecx = yp
movl 8(%ebp),%edx # edx = xp
movl (%ecx),%eax # eax = *yp (t1)
movl (%edx),%ebx # ebx = *xp (t0)
movl %eax,(%edx) # *xp = eax
movl %ebx,(%ecx) # *yp = ebx
45
Complete memory
addressing modes
• Most General Form:
D(Rb,Ri,S) Mem[Reg[Rb]+S*Reg[Ri]+ D]
– D: Constant “displacement” 1, 2, or 4 bytes
– Rb: Base register: Any of 8 integer registers
– Ri: Index register: Any, except for %esp
• Unlikely you’d use %ebp, either
– S: Scale: 1, 2, 4, or 8 (why these numbers?)

• Special Cases
(Rb,Ri) Mem[Reg[Rb]+Reg[Ri]]
D(Rb,Ri) Mem[Reg[Rb]+Reg[Ri]+D]
(Rb,Ri,S) Mem[Reg[Rb]+S*Reg[Ri]]

46
Address computation examples
%edx 0xf000
%ecx 0x100

Expression Address Computation Address

0x8(%edx) 0xf000 + 0x8 0xf008
(%edx,%ecx) 0xf000 + 0x100 0xf100
(%edx,%ecx,4) 0xf000 + 4*0x100 0xf400
0x80(,%edx,2) 2*0xf000 + 0x80 0x1e080
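
The table rows above can be reproduced with a small C model of the general form D(Rb,Ri,S). This is only a sketch: effective_address is a made-up helper name, and an omitted base or index register is modelled by passing 0.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address for the general form D(Rb,Ri,S):
 * Reg[Rb] + S*Reg[Ri] + D. Pass 0 for an omitted base or index,
 * and s = 1 when no scale is written. */
uint32_t effective_address(uint32_t d, uint32_t rb, uint32_t ri, uint32_t s)
{
    assert(s == 1 || s == 2 || s == 4 || s == 8);  /* the only scales allowed */
    return rb + s * ri + d;
}
```

With %edx = 0xf000 and %ecx = 0x100, effective_address(0x80, 0, 0xf000, 2) corresponds to the last row, 0x80(,%edx,2). The four scale values match the sizes of 1-, 2-, 4-, and 8-byte array elements.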

47
Address computation
instruction
• leal Src,Dest
– Src is address mode expression
– Set Dest to address denoted by expression

• Uses
– Computing addresses without a memory reference
• E.g., translation of p = &x[i];
– Computing arithmetic expressions of the form x + k*y
• k = 1, 2, 4, or 8
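
Both uses of leal can be sketched in C. The function names below are invented for this example, and the register choices in the comments are only one possible allocation; unsigned arithmetic is used so that wraparound is well defined in C.

```c
#include <stdint.h>

/* Address computation without a memory reference: p = &x[i].
 * With 4-byte ints this is base + 4*i, e.g. leal (%eax,%ecx,4),%edx. */
int *element_address(int *x, int i)
{
    return &x[i];
}

/* Arithmetic of the form x + k*y: multiplying by 48 with one leal
 * and one shift, since 48*y = (y + 2*y) << 4. */
uint32_t mul48_with_lea(uint32_t y)
{
    uint32_t t = y + 2 * y;   /* leal (%edx,%edx,2),%edx : t = 3*y */
    return t << 4;            /* sall $4,%edx            : 16 * 3*y */
}
```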

48
Summary
• 32-bit x86 registers
• mov instruction: loads and stores
• memory addressing modes
– Example: swap()
• leal: address computation

49
7.5: ia32 integer arithmetic

50
Some arithmetic operations
• Two operand instructions:
Format Computation
addl Src,Dest Dest ← Dest + Src
subl Src,Dest Dest ← Dest - Src
imull Src,Dest Dest ← Dest * Src
sall Src,Dest Dest ← Dest << Src Also called shll
sarl Src,Dest Dest ← Dest >> Src Arithmetic
shrl Src,Dest Dest ← Dest >> Src Logical
xorl Src,Dest Dest ← Dest ^ Src
andl Src,Dest Dest ← Dest & Src
orl Src,Dest Dest ← Dest | Src

• No distinction between signed and unsigned int (why?)
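
One way to see why: in two's complement, addition, subtraction, and the low 32 bits of a product produce the same bit pattern whether the operands are read as signed or unsigned. The right shifts are the exception, which is why the table lists both sarl and shrl. The sketch below models this on raw 32-bit patterns; the function names are made up for illustration, and shift counts are assumed to be in the range 0 to 31.

```c
#include <stdint.h>

/* addl on raw bit patterns: wraps modulo 2^32, no notion of sign. */
uint32_t add_bits(uint32_t a, uint32_t b)
{
    return a + b;
}

/* shrl: logical right shift, fills with zeros (n in 0..31). */
uint32_t shift_logical(uint32_t x, unsigned n)
{
    return x >> n;
}

/* sarl: arithmetic right shift, fills with copies of the sign bit.
 * Written with explicit masks so it does not rely on how the C
 * compiler treats >> on negative signed values (n in 0..31). */
uint32_t shift_arithmetic(uint32_t x, unsigned n)
{
    uint32_t sign_fill = (x & 0x80000000u) ? ~(UINT32_MAX >> n) : 0u;
    return (x >> n) | sign_fill;
}
```

For example, add_bits(0xFFFFFFFD, 5) returns 2, which is the right answer both for -3 + 5 (signed) and for 4294967293 + 5 modulo 2^32 (unsigned).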

51
Some arithmetic operations
• One operand instructions
Format Computation
incl Dest Dest ← Dest + 1
decl Dest Dest ← Dest - 1
negl Dest Dest ← -Dest
notl Dest Dest ← ~Dest

• See book for more instructions

52
Using leal for arithmetic expressions
int arith(int x, int y, int z)
{
    int t1 = x+y;
    int t2 = z+t1;
    int t3 = x+4;
    int t4 = y * 48;
    int t5 = t3 + t4;
    int rval = t2 * t5;
    return rval;
}

arith:
    # Set Up
    pushl %ebp
    movl %esp,%ebp

    # Body
    movl 8(%ebp),%eax
    movl 12(%ebp),%edx
    leal (%edx,%eax),%ecx
    leal (%edx,%edx,2),%edx
    sall $4,%edx
    addl 16(%ebp),%ecx
    leal 4(%edx,%eax),%eax
    imull %ecx,%eax

    # Finish
    movl %ebp,%esp
    popl %ebp
    ret

53
Understanding arith
int arith(int x, int y, int z)
{
    int t1 = x+y;
    int t2 = z+t1;
    int t3 = x+4;
    int t4 = y * 48;
    int t5 = t3 + t4;
    int rval = t2 * t5;
    return rval;
}

Stack
Offset  Contents
  16    z
  12    y
   8    x
   4    Rtn adr
   0    Old %ebp   ← %ebp

movl 8(%ebp),%eax # eax = x
movl 12(%ebp),%edx # edx = y
leal (%edx,%eax),%ecx # ecx = x+y (t1)
leal (%edx,%edx,2),%edx # edx = 3*y
sall $4,%edx # edx = 48*y (t4)
addl 16(%ebp),%ecx # ecx = z+t1 (t2)
leal 4(%edx,%eax),%eax # eax = 4+t4+x (t5)
imull %ecx,%eax # eax = t5*t2 (rval)
54
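
The register annotations above can be checked by replaying the instruction sequence in C, with one variable per register, and comparing against the straight-line C version. This is a sketch only: arith_c and arith_regs are names invented here, and unsigned types are used so that overflow wraps the way the hardware does instead of being undefined in C.

```c
#include <stdint.h>

/* The C source of arith, on unsigned values so wraparound is defined. */
uint32_t arith_c(uint32_t x, uint32_t y, uint32_t z)
{
    uint32_t t1 = x + y;
    uint32_t t2 = z + t1;
    uint32_t t3 = x + 4;
    uint32_t t4 = y * 48;
    uint32_t t5 = t3 + t4;
    return t2 * t5;
}

/* Step-by-step replay of the assembly body; variables are named
 * after the registers they stand for. */
uint32_t arith_regs(uint32_t x, uint32_t y, uint32_t z)
{
    uint32_t eax = x;            /* movl 8(%ebp),%eax       */
    uint32_t edx = y;            /* movl 12(%ebp),%edx      */
    uint32_t ecx = edx + eax;    /* leal (%edx,%eax),%ecx   */
    edx = edx + 2 * edx;         /* leal (%edx,%edx,2),%edx */
    edx <<= 4;                   /* sall $4,%edx            */
    ecx += z;                    /* addl 16(%ebp),%ecx      */
    eax = 4 + edx + eax;         /* leal 4(%edx,%eax),%eax  */
    eax *= ecx;                  /* imull %ecx,%eax         */
    return eax;
}
```

For x = 1, y = 2, z = 3 both versions give t2 = 6 and t5 = 101, so the result is 606.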
Another example
int logical(int x, int y)
{
    int t1 = x^y;
    int t2 = t1 >> 17;
    int mask = (1<<13) - 7;
    int rval = t2 & mask;
    return rval;
}

logical:
    # Setup
    pushl %ebp
    movl %esp,%ebp

    # Body
    movl 8(%ebp),%eax
    xorl 12(%ebp),%eax
    sarl $17,%eax
    andl $8185,%eax

    # Finish
    movl %ebp,%esp
    popl %ebp
    ret

2^13 = 8192, 2^13 – 7 = 8185

movl 8(%ebp),%eax # eax = x
xorl 12(%ebp),%eax # eax = x^y (t1)
sarl $17,%eax # eax = t1>>17 (t2)
andl $8185,%eax # eax = t2 & 8185
55
7.6: 64-bit x86 architecture

56
Data representations:
ia32 and x86-64
Sizes of C objects (in bytes)

C data type    Typical 32-bit   Intel ia32   x86-64
char           1                1            1
short          2                2            2
int            4                4            4
long           4                4            8
long long      8                8            8
float          4                4            4
double         8                8            8
long double    8                10/12        10/16
char *         4                4            8
(or any other pointer)
57

x86-64 integer registers
%rax %eax %r8 %r8d

%rbx %ebx %r9 %r9d

%rcx %ecx %r10 %r10d

%rdx %edx %r11 %r11d

%rsi %esi %r12 %r12d

%rdi %edi %r13 %r13d

%rsp %esp %r14 %r14d

%rbp %ebp %r15 %r15d

– Extend existing registers. Add 8 new ones.
– Make %ebp/%rbp general purpose

58
Instructions
• Long word l (4 Bytes) ↔ Quad word q (8 Bytes)

• New instructions:
– movl → movq
– addl → addq
– sall → salq
– etc.

• 32-bit instructions that generate 32-bit results

– Set higher order bits of destination register to 0
– Example: addl
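
The zeroing rule can be modelled in C by treating a 64-bit register as a uint64_t. The sketch below is an illustration under that assumption; addl_into_reg64 is a hypothetical name, not an instruction or library call.

```c
#include <stdint.h>

/* Model of a 32-bit addl whose destination is a 64-bit register:
 * the low 32 bits receive the sum, the high 32 bits become 0
 * regardless of what the register held before. */
uint64_t addl_into_reg64(uint64_t dest, uint32_t src)
{
    uint32_t low = (uint32_t)dest + src;  /* 32-bit add, wraps modulo 2^32 */
    return (uint64_t)low;                 /* zero-extend into 64 bits */
}
```

So a register holding 0xFFFFFFFF00000001 ends up as 0x0000000000000003 after adding 2 with addl, whereas addq would keep the upper half.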

59
Swap in 32-bit mode
void swap(int *xp, int *yp)
{
    int t0 = *xp;
    int t1 = *yp;
    *xp = t1;
    *yp = t0;
}

swap:
    # Setup
    pushl %ebp
    movl %esp,%ebp
    pushl %ebx

    # Body
    movl 12(%ebp),%ecx
    movl 8(%ebp),%edx
    movl (%ecx),%eax
    movl (%edx),%ebx
    movl %eax,(%edx)
    movl %ebx,(%ecx)

    # Finish
    movl -4(%ebp),%ebx
    movl %ebp,%esp
    popl %ebp
    ret

60
Swap in 64-bit Mode
void swap(int *xp, int *yp)
{
    int t0 = *xp;
    int t1 = *yp;
    *xp = t1;
    *yp = t0;
}

swap:
    movl (%rdi), %edx
    movl (%rsi), %eax
    movl %eax, (%rdi)
    movl %edx, (%rsi)
    retq

• Operands passed in registers (why useful?)
– First (xp) in %rdi, second (yp) in %rsi
– 64-bit pointers
• No stack operations required
• 32-bit data
– Data held in registers %eax and %edx
– movl operation
61
Swap Long Ints in 64-bit Mode
void swap_l(long int *xp, long int *yp)
{
    long int t0 = *xp;
    long int t1 = *yp;
    *xp = t1;
    *yp = t0;
}

swap_l:
    movq (%rdi), %rdx
    movq (%rsi), %rax
    movq %rax, (%rdi)
    movq %rdx, (%rsi)
    retq

• 64-bit data
– Data held in registers %rax and %rdx
– movq operation
– “q” stands for quad-word

62
7.7: Condition codes

63
Processor State (ia32, Partial)
• Information about currently executing program
– Temporary data ( %eax, … )
– Location of runtime stack ( %ebp, %esp )
– Location of current code control point ( %eip, … )
– Status of recent tests ( CF, ZF, SF, OF )

Register                         Role
%eax %ecx %edx %ebx %esi %edi    General purpose registers
%esp                             Current stack top
%ebp                             Current stack frame
%eip                             Instruction pointer
CF ZF SF OF                      Condition codes
64
Condition codes (implicit setting)
• Single bit registers
CF Carry Flag (for unsigned) SF Sign Flag (for signed)
ZF Zero Flag OF Overflow Flag (for signed)

• Implicitly set (think of it as side effect) by arithmetic operations

Example: addl/addq Src,Dest ↔ t = a+b
– CF set if carry out from most significant bit (unsigned overflow)
– ZF set if t == 0
– SF set if t < 0 (as signed)
– OF set if two’s complement (signed) overflow
(a>0 && b>0 && t<0) || (a<0 && b<0 && t>=0)

• Not set by lea instruction

• Full documentation link on course website
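
The four rules for addl can be written out as a C sketch that computes each flag from the operand bit patterns. The struct and function names are invented for this example. The overflow test is expressed with sign bits rather than the comparison chain above, but it describes the same condition: both inputs have the same sign and the result has the other one.

```c
#include <stdbool.h>
#include <stdint.h>

struct add_flags { bool cf, zf, sf, of; };

/* Flags after t = a + b on 32-bit operands (models addl). */
struct add_flags flags_after_add(uint32_t a, uint32_t b)
{
    uint32_t t = a + b;                        /* wraps modulo 2^32 */
    struct add_flags f;

    f.cf = t < a;                              /* carry out of bit 31 */
    f.zf = (t == 0);
    f.sf = (t >> 31) != 0;                     /* sign bit of the result */
    f.of = ((~(a ^ b) & (a ^ t)) >> 31) != 0;  /* same-sign inputs, result sign differs */
    return f;
}
```

For instance, 0x7FFFFFFF + 1 sets SF and OF (signed overflow) but not CF, while 0xFFFFFFFF + 1 sets CF and ZF (unsigned overflow) but not OF.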

65
Condition Codes
(Explicit Setting: Compare)
• Explicit Setting by Compare Instruction

cmpl/cmpq Src2,Src1

cmpl b,a like computing a-b without setting destination

CF set if the subtraction a-b needs a borrow into the most significant bit, i.e. a < b as unsigned

(used for unsigned comparisons)
ZF set if a == b
SF set if (a-b) < 0 (as signed)
OF set if two’s complement (signed) overflow:
(a>0 && b<0 && (a-b)<0)
|| (a<0 && b>0 && (a-b)>0)
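
A matching sketch for cmpl b,a follows. The names are again hypothetical. One assumption worth stating: on x86 the carry flag after a subtraction records a borrow, so CF is modelled here as a < b on the unsigned values.

```c
#include <stdbool.h>
#include <stdint.h>

struct cmp_flags { bool cf, zf, sf, of; };

/* Flags after cmpl b,a: computes a - b and discards the result. */
struct cmp_flags flags_after_cmp(uint32_t a, uint32_t b)
{
    uint32_t t = a - b;                       /* wraps modulo 2^32 */
    struct cmp_flags f;

    f.cf = a < b;                             /* borrow: a is below b (unsigned) */
    f.zf = (a == b);
    f.sf = (t >> 31) != 0;                    /* sign bit of a - b */
    f.of = (((a ^ b) & (a ^ t)) >> 31) != 0;  /* inputs differ in sign, result sign differs from a */
    return f;
}
```

Comparing the most negative int (0x80000000) with 1 gives SF = 0 and OF = 1, so SF^OF is 1 and the signed "less" condition holds, as it should.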

66
Condition Codes
(Explicit Setting: Test)
• Explicit Setting by Test instruction

testl/testq Src2,Src1

testl b,a like computing a&b w/o setting destination

– Sets condition codes based on value of Src1 & Src2

– Useful to have one of the operands be a mask

ZF set when a&b == 0

SF set when a&b < 0
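
As a small illustration of the mask idea, the sketch below models the two flags listed for testl. The helper names are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* ZF after testl mask,value: set when none of the masked bits is 1. */
bool test_zf(uint32_t value, uint32_t mask)
{
    return (value & mask) == 0;
}

/* SF after testl mask,value: the sign bit of value & mask. */
bool test_sf(uint32_t value, uint32_t mask)
{
    return ((value & mask) >> 31) != 0;
}
```

Testing a value against itself, as in testl %eax,%eax, sets ZF exactly when the value is zero and SF exactly when it is negative, which makes it a common way to check a single register.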

67
Reading Condition Codes
• SetX Instructions
– Set single byte based on combinations of
condition codes

SetX   Condition       Description
sete   ZF              Equal / Zero
setne  ~ZF             Not Equal / Not Zero
sets   SF              Negative
setns  ~SF             Nonnegative
setg   ~(SF^OF)&~ZF    Greater (Signed)
setge  ~(SF^OF)        Greater or Equal (Signed)
setl   (SF^OF)         Less (Signed)
setle  (SF^OF)|ZF      Less or Equal (Signed)
seta   ~CF&~ZF         Above (unsigned)
setb   CF              Below (unsigned)
68
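
The conditions in the table can be tried out in C by first computing the flags a compare would leave and then evaluating each formula. Everything here is a sketch: struct cc, compare, and the cond_* helpers are illustrative names, and the flag model assumes the usual x86 behaviour for subtraction (CF records a borrow).

```c
#include <stdbool.h>
#include <stdint.h>

struct cc { bool cf, zf, sf, of; };

/* Flags as cmpl b,a would leave them (computes a - b). */
struct cc compare(uint32_t a, uint32_t b)
{
    uint32_t t = a - b;
    struct cc f;
    f.cf = a < b;
    f.zf = (a == b);
    f.sf = (t >> 31) != 0;
    f.of = (((a ^ b) & (a ^ t)) >> 31) != 0;
    return f;
}

/* One helper per table row used below; SF^OF is written as sf != of. */
bool cond_g(struct cc f)  { return !(f.sf != f.of) && !f.zf; }  /* setg  */
bool cond_ge(struct cc f) { return !(f.sf != f.of); }           /* setge */
bool cond_l(struct cc f)  { return f.sf != f.of; }              /* setl  */
bool cond_le(struct cc f) { return (f.sf != f.of) || f.zf; }    /* setle */
bool cond_a(struct cc f)  { return !f.cf && !f.zf; }            /* seta  */
bool cond_b(struct cc f)  { return f.cf; }                      /* setb  */
```

The bit pattern 0xFFFFFFFF compared with 1 shows the signed/unsigned split: it is "less" (as -1) but also "above" (as 4294967295).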
Reading Condition Codes (Cont.)
• setx Instructions:
Set single byte based on combination of condition codes
• One of 8 addressable byte registers
– Does not alter remaining 3 bytes
– Typically use movzbl to finish job

Byte registers: %al, %ah (%eax); %cl, %ch (%ecx); %dl, %dh (%edx); %bl, %bh (%ebx)

int gt (int x, int y)
{
    return x > y;
}

Body

movl 12(%ebp),%eax # eax = y
cmpl %eax,8(%ebp) # Compare x : y
setg %al # al = x > y
movzbl %al,%eax # Zero rest of %eax
69
Reading Condition Codes: x86-64
• setx Instructions:
– Set single byte based on combination of condition codes
– Does not alter remaining 7 bytes of the 64-bit register

int gt (long x, long y)
{
    return x > y;
}

long lgt (long x, long y)
{
    return x > y;
}

Body (same for both)

xorl %eax, %eax # eax = 0
cmpq %rsi, %rdi # Compare x and y
setg %al # al = x > y

Is %rax zero?
Yes: 32-bit instructions set high order 32 bits to 0!
70
Jumping
jX Instructions:
Jump to different part of code depending on condition codes

jX    Condition       Description
jmp   1               Unconditional
je    ZF              Equal / Zero
jne   ~ZF             Not Equal / Not Zero
js    SF              Negative
jns   ~SF             Non-negative
jg    ~(SF^OF)&~ZF    Greater (Signed)
jge   ~(SF^OF)        Greater or Equal (Signed)
jl    (SF^OF)         Less (Signed)
jle   (SF^OF)|ZF      Less or Equal (Signed)
ja    ~CF&~ZF         Above (unsigned)
jb    CF              Below (unsigned)
71
Summary
• Condition codes (C, Z, S, O)
• Explicit setting of condition codes
– Compare
– Test
• Reading condition codes
– setX
• Jumps

From Everand
PlayStation Architecture: Architecture of Consoles: A Practical Analysis, #6
Rodrigo Copetti
No ratings yet
Xbox Architecture: Architecture of Consoles: A Practical Analysis, #13
From Everand
Xbox Architecture: Architecture of Consoles: A Practical Analysis, #13
Rodrigo Copetti
No ratings yet
GameCube Architecture: Architecture of Consoles: A Practical Analysis, #10
From Everand
GameCube Architecture: Architecture of Consoles: A Practical Analysis, #10
Rodrigo Copetti
No ratings yet
Nintendo 64 Architecture: Architecture of Consoles: A Practical Analysis, #8
From Everand
Nintendo 64 Architecture: Architecture of Consoles: A Practical Analysis, #8
Rodrigo Copetti
No ratings yet
PC Hardware Explained
From Everand
PC Hardware Explained
V. Subhash
No ratings yet

07 Basicx86Architecture 1up

Uploaded by

07 Basicx86Architecture 1up

Uploaded by
