IA 32 Intel 32/64 Bit Architecture
[Figure: Intel 32/64-bit operating modes. x86 / IA-32 Mode: Real Mode, Protected Mode, V86 Mode. x86-64: Compatibility Mode, 64-bit Mode.]
Modern Microprocessors — Fall 2012 IA-32 Dr. Martin Land 1 Modern Microprocessors — Fall 2012 IA-32 Dr. Martin Land 2
8086 segment mapping
  SEG = 16-bit segment selector
  SEG × 10h = 20-bit physical base address
  [Figure: segment register holding a selector; Selector = index to descriptor table → descriptor]
Paging
  Virtual division of address space
  Managed by OS for page swapping + address aliasing
  LINEAR ADDRESS = 32-bit address seen by OS
  PHYSICAL ADDRESS = real address in physical memory
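The 8086 mapping above (SEG × 10h plus offset) can be sketched in a few lines of Python. The function name is illustrative, and wrapping the result to 20 bits assumes an original 8086 with 20 address lines.

```python
def real_mode_physical(seg: int, offset: int) -> int:
    """Return the 20-bit physical address for a real-mode SEG:OFFSET pair."""
    base = (seg & 0xFFFF) * 0x10              # SEG x 10h = 20-bit base address
    # Assumption: wrap at 1 MB, as on an 8086 with 20 address lines.
    return (base + (offset & 0xFFFF)) & 0xFFFFF


if __name__ == "__main__":
    print(hex(real_mode_physical(0x1234, 0x0010)))  # 0x12350
```

Different SEG:OFFSET pairs can name the same byte: 1234:0010 and 1235:0000 both give 12350h.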
User Segment Registers
  Six user segment registers
    16-bit SELECTOR = pointer to memory segment
    Bits 15 ... 0: GS selector, FS selector, ES selector, SS selector, CS selector, DS selector
  Six defined user segments
    Code segment
      CS
      Accessible by instruction fetch
    Data segments
      DS, ES, FS, GS
      Accessible by load / store instructions
  [Figure: RAM with the segments selected by CS, SS, DS, ES, FS, GS]
IA-32 Segmentation Example
  Offset
    32-bit number
    Combination of registers and immediate values
    Offset ∈ {00000000, … , FFFFFFFF}
    Maximum segment size = 2^32 bytes = 4 GB
  Linear Address = base address + offset
  Example
    Logical Address = 1234:11223344
    Segment selector = 1234 → descriptor table → segment base address
    Linear Address = base address + Offset
  [Figure: RAM segment, byte 00000000 through byte FFFFFFFF]
Typical Segment Register Usage
  [Figure: typical assignment of segments to CS, SS, DS, ES, FS, GS]
Descriptor Tables
  Segment definition
    Write 64-bit descriptor → Descriptor Table in RAM
    Specify
      Base address: linear address of first byte in segment
      Limit: maximum offset into segment (segment size)
      Access: segment type + access rights
  GDT / LDT / IDT
    Global Descriptor Table (GDT)
      Accessible by any task
Segmentation Model
  [Figure: selector → descriptor in Global Descriptor Table (GDT); GDT Base held in GDTR]
Selector Format
  Field    Index      Table Indicator (TI)    Request Privilege Level (RPL)
  Width    13 bits    1 bit                   2 bits
  13-bit index to descriptor table
    2^13 = 2^3 × 2^10 = 8 × 1 K
    8 K (8192) descriptors per table
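To make the 13 / 1 / 2 bit split concrete, the sketch below decodes a 16-bit selector into its three fields. The function name and dictionary keys are illustrative. The byte offset of the descriptor is index × 8 because each descriptor occupies 64 bits.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit segment selector into index, TI and RPL fields."""
    selector &= 0xFFFF
    index = selector >> 3                 # upper 13 bits: descriptor index
    return {
        "index": index,
        "ti": (selector >> 2) & 0x1,      # 0 = GDT, 1 = LDT
        "rpl": selector & 0x3,            # requested privilege level
        "table_offset": index * 8,        # 8 bytes per descriptor
    }


if __name__ == "__main__":
    print(decode_selector(0x1234))
```

For the selector 1234h used in the segmentation example, this gives index 582, TI = 1 (LDT) and RPL = 0.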
Descriptor Format
  Bits     63 ... 56     55 ... 52      51 ... 48      47 ... 40     39 ... 16    15 ... 0
  Width    8             4              4              8             24           16
  Field    Base          Access         Limit          Access        Base         Limit
           31 ... 24     11 ... 8       19 ... 16      7 ... 0       23 ... 0     15 ... 0

  Access bit    11   10   9   8   7   6 5   4       3 2 1   0
  Access        G    D    0   0   P   DPL   S = 1   TYPE    A    User Segment
                G    0    0   0   P   DPL   S = 0   TYPE         System Segment

  Base         32-bit segment base address
  Limit        20-bit offset limit
               Segment Size = [1 + Limit] × (4096)^G bytes
               G = 0 ⇒ Byte Granularity: 2^20 bytes = 1 MB maximum segment size
               G = 1 ⇒ 4 KB Granularity: 2^20 × 4 KB = 4 GB maximum segment size
  Code Type    D = 0 ⇒ default 16-bit code + effective address
               D = 1 ⇒ default 32-bit code + effective address
  Present      P = 1 ⇒ segment in memory
               P = 0 ⇒ swapped-out segment
  DPL          Descriptor Privilege Level
  System       S = 0 ⇒ system segment
               S = 1 ⇒ user segment
  Type         Segment type code
  Access       A = 0 ⇒ segment not accessed
               A = 1 ⇒ segment has been accessed
Type Fields
  User Segments (S = 1)
    Access: G D 0 0 P DPL S = 1 TYPE A
    Code Segment
      Type = 1 C R
      C = 1 for Conforming code (in protection scheme)
      R = 1 for Readable Code (MOV EAX,CS:EA legal)
    Data Segment
      Type = 0 ED W
      ED = 1 for Expand Down (stack segment)
      W = 0 for Read-Only segment
  System Segments (S = 0)
    Access: G 0 0 0 P DPL S = 0 TYPE
    LDT Segment
      Type = 0010
    Task State Segment
      Active task: Type = 1011
      Inactive task: Type = 1001
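The field layout and the segment size formula above can be checked with a short Python sketch. The function names are illustrative, and the sample value 00CF9A000000FFFFh is an assumed flat 4 GB code descriptor chosen for the demonstration; it does not come from the slides.

```python
def segment_size(limit: int, g: int) -> int:
    """Segment Size = [1 + Limit] x 4096^G bytes, with a 20-bit Limit."""
    return (1 + (limit & 0xFFFFF)) * (4096 ** g)


def unpack_descriptor(d: int) -> dict:
    """Extract the fields of a 64-bit user segment descriptor."""
    base = ((d >> 16) & 0xFFFFFF) | (((d >> 56) & 0xFF) << 24)
    limit = (d & 0xFFFF) | (((d >> 48) & 0xF) << 16)
    access = ((d >> 40) & 0xFF) | (((d >> 52) & 0xF) << 8)   # 12 access bits
    return {
        "base": base,
        "limit": limit,
        "G": (access >> 11) & 1,
        "D": (access >> 10) & 1,
        "P": (access >> 7) & 1,
        "DPL": (access >> 5) & 3,
        "S": (access >> 4) & 1,
        "TYPE": (access >> 1) & 7,
        "A": access & 1,
    }


if __name__ == "__main__":
    fields = unpack_descriptor(0x00CF9A000000FFFF)  # assumed sample descriptor
    print(fields, segment_size(fields["limit"], fields["G"]))
```

For the sample value the sketch reports TYPE = 101 (code, non-conforming, readable), G = 1, and a segment size of 2^32 bytes.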
Segment Address Translation
  16-bit selector
    index (13 bits) | TI (1 bit) | RPL (2 bits)
  [Figure: selector index → descriptor in GDT or LDT]
Control + Status Registers
  Control Registers
    CR0    Options
    CR1    Reserved
    CR2    Last Page Fault
    CR3    Directory Base Address
    CR4    Protected Mode Options
  EFLAG Registers
Linear Address Translation
  directory (10 bits) | page table (10 bits) | offset (12 bits)
  2^10 page tables × 2^10 pages/page table × 2^12 bytes/page = 2^32 bytes
IA-32 4 KB Paging
  [Figure: Page Directory Base Register → page directory → page table → page]
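As a worked example of the 10 / 10 / 12 split, the following Python sketch breaks a 32-bit linear address into its directory index, page table index, and byte offset. The function name is illustrative.

```python
def split_linear_4k(addr: int) -> tuple:
    """Split a 32-bit linear address into (directory, table, offset)."""
    addr &= 0xFFFFFFFF
    directory = (addr >> 22) & 0x3FF   # top 10 bits select a directory entry
    table = (addr >> 12) & 0x3FF       # next 10 bits select a page table entry
    offset = addr & 0xFFF              # low 12 bits select a byte in the page
    return directory, table, offset


if __name__ == "__main__":
    d, t, o = split_linear_4k(0x11223344)
    print(hex(d), hex(t), hex(o))  # 0x44 0x223 0x344
```

Recombining the fields as (d << 22) | (t << 12) | o returns the original address.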
Entries in Page Directory and Page Table
Translation Lookaside Buffer (TLB)
Linear Address Format (Intel 80386 Microprocessor)
  Linear Address
    directory (10 bits) | page (10 bits) | offset (12 bits)
    page + offset = 22 bits, offset = 12 bits
  Physical Address from page translation is
    A = PT × 400000h + n × 1000h + offset
    PT × 400000h = PT followed by 22 binary 0's
    n × 1000h = n followed by 12 binary 0's
    PT (10 bits) | n (10 bits) | offset (12 bits)
  Physical Address ≡ Linear Address
Paging Options
  PG (paging)
    Bit 31 of CR0
    Enables IA-32 page translation
    32-bit linear address → 32-bit physical address
  Page Size Extensions (PSE)
    Bit 4 of CR4
    Enables large page sizes
      4 MB pages
      2 MB pages (with PAE flag set)
IA-32 4 KB Paging
  [Figure: 32-bit address translated through 32-bit directory entry and 32-bit page table entry]
4 MB Paging
  No middle address field
    Directory → single page table, 1024 entries
    Directory entry → 4 MB page, 22-bit offset into page
  [Figure: 32-bit address translated through one 32-bit directory entry]
4 MB Page Directory Entry
P6 Physical Address Extension (PAE)
  36-bit physical address
    32-bit linear address → 36-bit physical address
    4 added address lines on CPU I/O bus
    2^36 bytes = 64 GB address space
  Modified Directory + Page Table structure
    Directory Pointer Table (DPT)
      Top of table hierarchy
      Defines 4 page table directories (first 2 bits in linear address)
PAE: 4 KB Pages
  Linear address fields: Pointer (2 bits) | Dir (9 bits) | Table (9 bits) | Offset (12 bits)
  [Figure: 64-bit directory and page table entries produce a 36-bit address]
Table Entry: 2 MB Pages
Accessing 64 GB Physical Memory
  Page Table structure accesses 4 GB of 64 GB address space
    Directory Pointer Table (DPT) → 2^2 = 4 directories
    Directory → 2^9 = 512 page tables
    Page table → 2^9 = 512 pages
    Page = 2^12 bytes
    Simultaneous address space = 2^2 × 2^9 × 2^9 × 2^12 bytes = 2^32 bytes
  64 GB address space ⇒ DPT updates
    Address space permits 16 different DPT tables
      2^(36 – 32) = (64 GB / 4 GB) = 16
    4 of 64 possible directories "visible" at any time
  Accessing additional 4 GB memory sections
    Change base address for DPT
      New table defines 4 new directories
    Write new entries into DPT
      Entries point to new directories
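The counts above follow directly from the PAE field widths (2 / 9 / 9 / 12 bits) and the 36-bit physical address. A small Python sketch, with illustrative names, reproduces the arithmetic.

```python
# PAE linear address field widths (bits), as listed above.
POINTER_BITS, DIR_BITS, TABLE_BITS, OFFSET_BITS = 2, 9, 9, 12
PHYSICAL_BITS = 36


def pae_visible_bytes() -> int:
    """Bytes reachable through one Directory Pointer Table."""
    return 2 ** (POINTER_BITS + DIR_BITS + TABLE_BITS + OFFSET_BITS)


def pae_dpt_count() -> int:
    """Number of distinct 4 GB windows in the physical address space."""
    return 2 ** PHYSICAL_BITS // pae_visible_bytes()


if __name__ == "__main__":
    print(pae_visible_bytes() == 2 ** 32)          # True
    print(pae_dpt_count())                         # 16
    print(pae_dpt_count() * 2 ** POINTER_BITS)     # 64 possible directories
```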
New Instructions for IA-32
  Instruction         Description
  ARPL r/m16,r16      Adjust RPL of r/m16 to not less than RPL of r16
  LAR r16,r/m16       Load Access Rights: r16 ← r/m16 masked by FF00H
  LSL r16,r/m16       Load: r16 ← segment limit, selector r/m16
  LSL r32,r/m32       Load: r32 ← segment limit, selector r/m32
  SGDT, SIDT m        Store GDTR to m, Store IDTR to m
  SLDT r/m16          Stores segment selector from LDTR in r/m16
  STR r/m16           Stores segment selector from Task Register in r/m16
  VERR r/m16          Set ZF=1 if segment specified with r/m16 can be read
  VERW r/m16          Set ZF=1 if segment specified with r/m16 can be written
  CLTS                Clears Task Switch flag in CR0
  LGDT m16&32         Load m into GDTR
  LIDT m16&32         Load m into IDTR
  LLDT r/m16          Load segment selector r/m16 into LDTR
  LTR r/m16           Load r/m16 into task register

  r = register    m = memory pointer    16/32 = length in bits
  r16 = {AX, CX, DX, BX, SP, BP, SI, DI}
  r32 = {EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI}
Segment-Level Memory Protection
  Segment tables "hide" segments from user
    User program knows segment selectors
    Segment base address hidden in descriptor
    Descriptor hidden in GDT / LDT
    GDT / LDT access: privileged machine instruction
  Local data / code segments
    Defined in LDT
    LDT selector hidden in Task State Segment (TSS)
    Tasks cannot locate / access
      TSS, LDT selector, LDT, segment defined in LDT
  Hardware denies memory access on
    Segment overflow
      Offset in logical address > segment limit in descriptor
    Action does not match access type in descriptor
      Write to CS / instruction fetch from DS / user access to system segment
    Insufficient privilege level
Privilege Rings
  Segment access parameters
    DPL field in segment descriptor
      Access bits 11 ... 0: G D 0 0 P DPL S TYPE A
    RPL field in segment selector
      Index (13 bits) | TI (1 bit) | RPL (2 bits)
    Current Privilege Level (CPL) = DPL of current CS
  Ring    Function
  0       OS kernel mode
  2       Protected user functions
  Most systems use Ring 0 = Kernel Mode and Ring 3 = User Mode
Access Rights by Privilege
  Memory access operations by segment type
    Data segment access
      Code performs load / store instruction
    Code segment access
      Code performs jump / call / interrupt instruction
  Current code
    Selector in CS → descriptor → code segment containing instruction
  Access rights
    Selector in CS → descriptor → DPL in access field
  Access Denied
    Load / store to data segment with DPL < CPL
  [Figure: access granted or denied as a function of CPL and DPL]
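The data segment rule above (access denied when DPL < CPL) can be sketched as a small predicate. The function name is illustrative. The rpl argument is an addition beyond the rule stated on the slide: it assumes the usual IA-32 behavior in which the weaker of CPL and RPL is compared against DPL.

```python
def data_access_allowed(cpl: int, dpl: int, rpl: int = 0) -> bool:
    """Return True if a load / store to a data segment passes the privilege check.

    Lower numbers are more privileged (ring 0 = kernel, ring 3 = user).
    Assumption: effective privilege is max(CPL, RPL).
    """
    effective = max(cpl, rpl)
    return effective <= dpl       # denied when DPL < effective privilege level


if __name__ == "__main__":
    print(data_access_allowed(cpl=3, dpl=0))         # False: user code, kernel data
    print(data_access_allowed(cpl=0, dpl=3))         # True: kernel code, user data
    print(data_access_allowed(cpl=0, dpl=0, rpl=3))  # False: selector carries user RPL
```

The third case shows why the OS adjusts the RPL of selectors received from user code (the ARPL instruction): the check then fails even though the OS itself runs at CPL = 0.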
System Calls
  OS code
    Runs at DPL = 0
  User code
    Runs at CPL = 3
    Cannot jump or call OS code directly
  Gate mechanism
    OS advertises CS:EIP for system call
    CS points to special descriptor in GDT
    Similar mechanisms for system call / interrupts / task switch
  [Figure: rings 0 to 3; user code reaches OS code through a call gate (system call)]
Gate Type System Segments (S = 0)
  Defines indirect access
    Selector → Gate in descriptor table
    Instead of normal descriptor
  Gate provides new logical address SEG:OFFSET
    Gate privilege permits ring 0 kernel access
  Word Count
    Privileged Stack: user stack segment reserved for system calls
    Gate call copies 32-bit words from User Stack to Privileged Stack
  Gate Format
    Width    16        8         3    5             16     16
    Field    OFFSET    access    0    word count    SEG    OFFSET
Call Gate
  Width    16                    8         3    5             16                      16
  Field    destination offset    access    0    word count    destination selector    destination offset
  access byte
    bit 7    bits 6,5    bits 4,3,2,1,0
    P        DPL         01100
  [Figure: rings 0 to 3; user system call with CS:EIP selector → call gate in GDT → OS code]
The Trojan Horse Problem
  Problem
    User program denied access to protected segment
      data DPL < user CPL
    User program performs system call
      Passes segment selector to OS as pointer
    OS accesses protected data segment
      (data DPL ≥ OS CPL)
Interrupt Service
  [Figure: CS:EIP, instruction INT n → interrupt gate in IDT (destination offset, access, 0, word count, destination selector, destination offset) → CS Shadow Register (CS descriptor)]
  Execute INT n
    CS:EIP of next instruction pushed onto stack
  Interrupt Gate
    Address = IDT base + n × 8
    Loaded to CS Shadow Register
    Selector:offset from Interrupt Gate loaded to CS:EIP
    CS:EIP = address of ISR (interrupt handler)
  ISR finishes with IRET → pop previous CS:EIP
Hardware Task Creation
  Write into Task State Segment (TSS)
    back link
    stacks and stack pointers for CPL = 0, 1, 2
      task switch to higher DPL → switch to separate stack
    CR3
    EIP
    EFLAGS
    EAX, ECX, EBX, EDX, ESP, EBP, ESI, EDI
    ES, CS, SS, DS, FS, GS
    LDT Selector
    OS specific information
  Write TSS descriptor
    Normal descriptor entered into GDT / LDT
    Points to TSS for task
  Write Task Gate
    Gate descriptor entered into GDT / LDT / IDT
    Destination selector points to TSS descriptor in GDT / LDT
Task Switching by Jump
  [Figure: CS:EIP, JMP selector → task gate in GDT (destination offset, access, 0, word count, destination selector, destination offset) → TSS descriptor; CS Shadow Register (CS descriptor)]
  No nesting
    Back link not set
  Current code executes JMP to CS:EIP
    CS selector → Task Gate in GDT / LDT
    Descriptor (Task Gate) loaded to CS Shadow Register
Context Switch
  [Figure: CPU registers EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP, EIP; segment registers CS, DS, SS, ES, FS, GS with their descriptors; GDT Address and GDT Limit; LDT Selector and LDT Descriptor; TSS Selector and TSS Descriptor; Old TSS and New TSS in RAM]
  1. New TSS selector
  2. TSS descriptor auto-update
  3. Auto-save old context
  4. Auto-load new context (values for LDT, segment registers, general registers)
  5. Auto-update shadow registers for LDT
  6. Auto-update shadow registers for CS, DS, SS, ES, FS, GS
Task Switching by Call / Return Instruction
  Current code executes CALL to CS:EIP
    Push CS:EIP of next instruction onto stack
    CS selector → Task Gate in GDT / LDT
    Descriptor (Task Gate) loaded to CS Shadow Register
  CPU
    Recognizes descriptor = Task Gate
    Copies context of old task to old TSS
    Writes old TSS Selector → back link of new TSS
    Loads Destination Selector → TSS Register
    Selector in TSS Register → TSS descriptor
    Loads TSS descriptor to TSS Shadow Register
    Loads context from new TSS to run called task
  Called task ends with IRET (or preemption)
  CPU
    Copies context of new task to new TSS
    Loads back link → TSS Register
    Selector in TSS Register → old TSS descriptor
    Loads old TSS descriptor → TSS Shadow Register
    Loads context from old TSS → restore old task
Real Mode
  Start up mode for IA-32 processor
    Processor runs like fast 8086
    Access only lowest 1 MB of memory
    OS boot code must be in low memory
  Create pseudo-descriptors in shadow registers
    Base Address field ← Selector × 10h
    Limit ← FFFFh
    CS access word
      G D 0 0 P DPL S CODE C  R A
      0 0 0 0 1 00  1 1    0  1 1
    DS, ES, FS, GS access word
      G 0 0 0 P DPL S CODE ED W A
      0 0 0 0 1 00  1 0    0  1 1
    SS access word
      G 0 0 0 P DPL S CODE ED W A
      0 0 0 0 1 00  1 0    1  1 1
Before Switching To Protected Mode
  OS starts in real mode
    Uses 8086 mechanisms
  Build GDT
    At least one Data Segment descriptor
    At least one Code Segment descriptor
  Build IDT
    Convert 32-bit 8086 ISR vectors to 64-bit ISR descriptors
  Build TSS for OS scheduler
    Put Task Gate for TSS into GDT
  Build Page Tables and Directory
    Linear Address = Physical Address
    Write Directory Physical Address into TSS
  Load GDT register and IDT register to CPU
Entering Protected Mode
  Set flag PE in CR0
  Enter protected mode
Instruction Types
  New instruction encoding for IA-32
Example Code
        ORG 0x100
section .text
        mov eax,11223344h
        push eax
        pop ebx
        call disp32
        mov ebx,55667788h
        call disp32
        mov ax,4C00h
        int 21h

disp32:
        mov cx,08h      ; counter = 8
        mov ah,02h      ; DOS function is print byte
nibble:
        rol ebx,4       ; move most significant nibble to least
        mov dl,bl       ; load BL to print buffer
        and dl,0fh      ; zero upper nibble
        add dl,30h      ; ASCII digit range
        cmp dl,39h      ; is nibble in [A-F]
        jle go          ; if not > 9 print
        add dl,7h       ; if > 9 ASCII letter range
go:     int 21h         ; print the byte
        loop nibble     ; CX-- and continue

        mov dl, 0dh     ; CR
        int 21h
        mov dl, 0ah     ; LF
        int 21h
        ret
Disassemble
00000000  66B844332211      mov eax,0x11223344
00000006  6650              push eax
00000008  665B              pop ebx
0000000A  E80E00            call 0x1b
0000000D  66BB88776655      mov ebx,0x55667788
00000013  E80500            call 0x1b
00000016  B8004C            mov ax,0x4c00
00000019  CD21              int 0x21
0000001B  B90800            mov cx,0x8
0000001E  B402              mov ah,0x2
00000020  66C1C304          rol ebx,0x4
00000024  88DA              mov dl,bl
00000026  80E20F            and dl,0xf
00000029  80C230            add dl,0x30
0000002C  80FA39            cmp dl,0x39
0000002F  7E03              jng 0x34
00000031  80C207            add dl,0x7
00000034  CD21              int 0x21
00000036  E2E8              loop 0x20
00000038  B20D              mov dl,0xd
0000003A  CD21              int 0x21
0000003C  B20A              mov dl,0xa
0000003E  CD21              int 0x21
00000040  C3                ret
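The disp32 routine is easier to follow with a high-level model beside it. The Python sketch below mirrors the loop step by step (rotate left by 4, mask the low nibble, add 30h, add 7 for values above '9') and returns the characters instead of calling DOS. It is a model of the algorithm, not a replacement for the assembly.

```python
def disp32(ebx: int) -> str:
    """Model of the disp32 routine: 8 hex digits followed by CR LF."""
    ebx &= 0xFFFFFFFF
    out = []
    for _ in range(8):                                   # mov cx,08h / loop nibble
        ebx = ((ebx << 4) | (ebx >> 28)) & 0xFFFFFFFF    # rol ebx,4
        dl = ebx & 0x0F                                  # mov dl,bl / and dl,0fh
        dl += 0x30                                       # ASCII digit range
        if dl > 0x39:                                    # nibble in [A-F]
            dl += 0x07                                   # ASCII letter range
        out.append(chr(dl))                              # int 21h, AH = 02h
    return "".join(out) + "\r\n"


if __name__ == "__main__":
    print(disp32(0x11223344), end="")
    print(disp32(0x55667788), end="")
```

After eight rotations of 4 bits, EBX holds its original value again, so the routine does not destroy its argument.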
Example of 32-bit Address Overrides
        ORG 0x100
section .data
filename db "test.txt",0
section .bss
handle  resw 1
section .text
        mov eax,'abcd'
        mov ebx,'ABCD'
        mov ebp,2000h
        mov [ebp],eax
        mov [ebp+4],ebx
create: mov dx,filename     ; point to file name
        mov cx,0h           ; default attributes
        mov ah,3ch          ; DOS create file
        int 21h             ; DOS system call
        jc end              ; stop on error
        mov [handle],ax     ; store file handle
write:  mov bx,[handle]     ; copy file handle to BX
        mov cx,8h           ; write 8 bytes to file
        mov edx,ebp         ; point EDX to buffer
        mov ah,40h          ; DOS write to file
        int 21h             ; DOS system call
        jc end              ; stop on error
close:  mov bx,[handle]     ; copy file handle to BX
        mov ah,3eh          ; DOS close file
        int 21h             ; DOS system call
end:    mov ax,4C00h        ; return to DOS
        int 21h             ; DOS system call

C:\nasm\programs>type TEST.TXT
abcdABCD
Assembler Output
00000100  66B861626364      mov eax,0x64636261
00000106  66BB41424344      mov ebx,0x44434241
0000010C  66BD00200000      mov ebp,0x2000
00000112  6667894500        mov [ebp+0x0],eax
00000117  6667895D04        mov [ebp+0x4],ebx
0000011C  BA4801            mov dx,0x148
0000011F  B90000            mov cx,0x0
00000122  B43C              mov ah,0x3c
00000124  CD21              int 0x21
00000126  721B              jc 0x43
00000128  A35401            mov [0x154],ax
0000012B  8B1E5401          mov bx,[0x154]
0000012F  B90800            mov cx,0x8
00000132  6689EA            mov edx,ebp
00000135  B440              mov ah,0x40
00000137  CD21              int 0x21
00000139  7208              jc 0x43
0000013B  8B1E5401          mov bx,[0x154]
0000013F  B43E              mov ah,0x3e
00000141  CD21              int 0x21
00000143  B8004C            mov ax,0x4c00
00000146  CD21              int 0x21
00000148  7465              jz 0xaf
0000014A  7374              jnc 0xc0
0000014C  2E7478            cs jz 0xc7
0000014F  7400              jz 0x51
  Prefix 66h selects a 32-bit operand and prefix 67h selects a 32-bit address offset in this 16-bit code.
  The last four lines are not code: bytes 74 65 73 74 2E 74 78 74 00 at 0148h are the string "test.txt",0 from section .data, which the disassembler decoded as instructions.
Operating Modes for Intel x86 Processors
Why 64 bits?
  Features of true 64-bit architecture
    64-bit ALU integer operands
    64-bit general purpose register set
    64-bit flat virtual address space
Metric Prefixes Data Types
Operand Ranges
How to Think About x86-64?
  x86-64 not true 64-bit architecture
    Optimized for default 32-bit integer
    64-bit integer operations by override
What Intel Said
  The move toward 64-bit computing for mainstream applications will initially focus on applications that are already constrained by 32-bit memory limitations.
  The challenge for IT organizations is to determine the best architecture for specific solutions, while taking into account total cost and value within the broader IT and business environments.
  Itanium architecture remains the platform of choice for the most demanding, business-critical data tier applications, such as high-end database and business intelligence solutions.
  Platforms based on the Intel Xeon processor with Intel EM64T are preferable for general purpose applications, such as Web and mail infrastructure, digital content creation, mechanical computer aided design, and electronic design automation; and for mixed environments in which optimized 32-bit performance remains critical.
  The 64-bit Tipping Point, September 2004
64-bit Extensions
  Operands
    IA-32 registers → 64-bit width
      EAX, EBX, ECX, EDX, ESP, EBP, ESI, EDI, EIP →
      RAX, RBX, RCX, RDX, RSP, RBP, RSI, RDI, RIP
    8 new general purpose registers (GPR)
      R8, R9, ... , R15
    Default 32-bit integer operand
      Override → 64-bit integer ALU operations
  Address
    64-bit flat linear address space
      Logical address = SEG:OFFSET
      SEG CS, DS, ES, SS → physical base address = 0
      64-bit OFFSET = Linear Address
      Segmentation enforces protection
    64-bit paging
      52-bit physical address
Summary of Operating Modes
  Operating Mode                  Operating    Default Address    Default Operand    Register      Typical GPR
                                  System       Size (bits)        Size (bits)        Extensions    Width (bits)
  x86-64    64-Bit Mode           64-bit OS    64                 32                 yes           64
            Compatibility Mode    64-bit OS    32                 32                 no            32
                                               16                 16                               16
  IA-32     Protected Mode        32-bit OS    32                 32                 no            32
                                               16                 16                               16
            Real Mode             16-bit OS    16                 16                 no            16
  Booting 64-bit OS
    Initialize CPU into real mode
    Switch CPU to 32-bit protected mode
    Switch CPU to 64-bit mode
    OS runs
      64-bit applications
      32-bit applications (in compatibility mode)
64-bit Applications: Typical Features
  Code fetch
    CS → code segment base address = 0
    64-bit instruction pointer RIP
    Linear Address = RIP
  IA-32 instruction syntax
    PUSH, POP, MOV, ADD, SUB, … work as usual
    Most instructions use 32-bit operands
      MOV EAX, 11223344h
      ADD EAX, [EBX+ESI+11223344]
  64-bit virtual address
    ADD EAX, [EBX+ESI+11223344]
    Logical Address = DS:EBX+ESI+11223344
    DS → data segment base address = 0
    Zero extend 32-bit EA = EBX+ESI+11223344 → 64-bit EA64
    Linear Address = EA64
IA-32 Prefixes
  IA-32 code prefixes: override default parameters
  66H
    16-bit code: 16-bit operand → 32-bit operand
    32-bit code: 32-bit operand → 16-bit operand
  67H
    16-bit code: 16-bit address offset → 32-bit address offset
    32-bit code: 32-bit address offset → 16-bit address offset
Prefixes for Instruction Encoding
  General instruction encoding
    Legacy prefixes | REX prefix | Opcode | ModR/M | SIB | Displacement | Immediate
  Legacy prefix = IA-32 prefix
  ModR/M = mod-reg-r/m
  SIB = scale-index-base
  Effective Address = base + scale * index + displacement
IA-32 and REX Overrides
  Mode             Default Address    Linear Address    Prefix
                   Size (bits)        Size (bits)
  64-bit           64                 64                none
                                      32                0x67
  Compatibility    32                 32                none
                                      16                0x67
                   16                 32                0x67
                                      16                none
Register Access
Instructions and Operands
  Default effective address is 64 bits
    EA = Base + Scale * Index + Displacement
  Default operand: 32 bits
    ADD EAX,[RBX]
  REX override operand → 64 bits
    ADD RAX,[RBX]
  Most displacements: 32 bits
    ADD EAX,[RSI+11223344]
    ADD RAX,[RSI+11223344]
  Special form of MOV: 64-bit absolute address
    MOV RAX,[1122334455667788]
  Most immediates: 32 bits
    ADD EAX,11223344
  Sign extended immediates for 64-bit operation
    ADD RAX,11223344
  Special form of MOV: 64-bit immediate
    MOV RAX,1122334455667788
Other Instructions
  Branch
    Near branch: default target address = 64 bits    JMP targ
    Far branch: use indirect target    JMP [pointer]
    Loop instructions check RCX
  Stack instructions
    Operand: 64 bits
    PUSH RAX
  New addressing mode (in 64-bit mode)
    RIP-relative
    EA = 64-bit RIP + 32-bit displacement
  String instructions
    LODSQ ; RAX ← [DS:RSI] , RSI ← RSI + 8
    STOSQ ; [ES:RDI] ← RAX , RDI ← RDI + 8
    MOVSQ
    CMPSQ
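The combination of 64-bit registers with 32-bit displacements and immediates relies on sign extension. The Python sketch below models that step and the effective address formula EA = Base + Scale * Index + Displacement. Function and parameter names are illustrative.

```python
MASK64 = (1 << 64) - 1


def sign_extend32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed number."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def effective_address(base: int, index: int = 0, scale: int = 1, disp32: int = 0) -> int:
    """EA = base + scale * index + sign-extended 32-bit displacement (mod 2^64)."""
    return (base + scale * index + sign_extend32(disp32)) & MASK64


if __name__ == "__main__":
    # ADD RAX,[RSI+11223344h] with RSI = 1000h
    print(hex(effective_address(0x1000, disp32=0x11223344)))   # 0x11224344
    # A displacement with bit 31 set is negative after sign extension
    print(hex(effective_address(0x1000, disp32=0xFFFFFFF0)))   # 0xff0
```

Because of the sign extension, a 32-bit displacement reaches 2 GB above or below the base register value.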
Segmentation Model
  [Figure: selectors index descriptors in the Global Descriptor Table (GDT, base in GDTR) and the Local Descriptor Table (LDT, base in LDTR); each segment register has a segment shadow register holding segment base, segment limit, and segment attributes]
Segmentation in x86-64
  Segment selectors
    As in IA-32
  Descriptor tables
    Direct descriptor table registers
      Global Descriptor Table Register (GDTR)
      Local Descriptor Table Register (LDTR)
      Interrupt Descriptor Table Register (IDTR)
    Located at 64-bit linear base address
    10-byte registers
      64-bit table base address (8 bytes)
      16-bit table limit (2 bytes)
  Descriptors
    User descriptor structure identical to IA-32
    System descriptors expanded to 128 bits
Selector/Descriptor Formats
Segmentation Process in 64-bit Mode
  Code segment
    Load selector to CS register: selector points to descriptor
    Load descriptor to shadow register
    Bit L = 1 ⇒ 64-bit mode (L = 0 ⇒ compatibility mode)
    Check DPL = current privilege level
    Other descriptor fields ignored
  FS and GS
    Load selector to FS/GS register: selector points to descriptor
    Load descriptor to shadow register
    Shadow register expanded → 64-bit segment base address
    Other descriptor fields ignored
Canonical Virtual Address
  Virtual (linear) address
    Maximum address length = 64 bits
    Minimum implemented address length = 48 bits
      2^48 bytes = 256 × 2^10 × 2^30 bytes = 256 K × 1 GB = 256 TB
    Procedures for address longer than 48 bits are defined by Intel or AMD
  Canonical Form
    Bit MSB+1 to bit 63 = copy of MSB
    Sign-extended format
    Splits address space into "positive" and "negative" sectors
    Used by OS for system management
  [Figure: bits 63 ... 48 copy the MSB (bit 47); bits 47 ... 0 = minimum implemented virtual address]
Physical Address Extension (PAE) Paging
  4 level paging system for 36-bit physical address
    64-bit Directory Pointer Table entry points to Directory
    64-bit Directory entry points to Page Table / Page
    64-bit Page Table entry points to Page
    12/21-bit Offset points to byte in Page
  (512 table entries per table) × (64 bits = 8 bytes per entry) = 4 KB / table
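The canonical form rule (bits MSB+1 through 63 copy the MSB) translates into a short test. The sketch below assumes the 48-bit minimum implementation by default. The function name and the bits parameter are illustrative.

```python
def is_canonical(addr: int, bits: int = 48) -> bool:
    """Return True if a 64-bit address is in canonical (sign-extended) form."""
    addr &= (1 << 64) - 1
    top = addr >> (bits - 1)                    # MSB of implemented part and all bits above it
    all_ones = (1 << (64 - bits + 1)) - 1
    return top == 0 or top == all_ones          # all copies of 0, or all copies of 1


if __name__ == "__main__":
    print(is_canonical(0x00007FFFFFFFFFFF))  # True: top of the "positive" sector
    print(is_canonical(0xFFFF800000000000))  # True: bottom of the "negative" sector
    print(is_canonical(0x0000800000000000))  # False: bit 47 set, bits 63..48 clear
```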
64-bit Paging
  48 significant bit canonical address
  Table structure
    Entries as in PAE
    512 entries in Pointer Table (4 in PAE)
    Added Page-Map Table at top of hierarchy defines 512 Pointer Tables
2 MB Page Size in 64-bit Mode
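With 512 entries per table, each level of the 64-bit hierarchy consumes 9 address bits, so a 48-bit address with 4 KB pages splits as 9 / 9 / 9 / 9 / 12. The sketch below shows the split. The function name and the field order in the returned tuple are illustrative.

```python
def split_va48(addr: int) -> tuple:
    """Split a 48-bit virtual address into (page map, pointer, directory, table, offset)."""
    page_map = (addr >> 39) & 0x1FF    # index into Page-Map Table
    pointer = (addr >> 30) & 0x1FF     # index into Directory Pointer Table
    directory = (addr >> 21) & 0x1FF   # index into Directory
    table = (addr >> 12) & 0x1FF       # index into Page Table
    offset = addr & 0xFFF              # byte within the 4 KB page
    return page_map, pointer, directory, table, offset


if __name__ == "__main__":
    print(split_va48(0x00007FFFFFFFFFFF))  # (255, 511, 511, 511, 4095)
    print(split_va48(0xFFFF800000000000))  # (256, 0, 0, 0, 0)
```

For 2 MB pages the page table level is skipped, and the low 21 bits form the offset.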
Table Entries
  Entry types: Page Map, Directory Pointer, Directory (4 KB Page), Directory (2 MB Page), Page Table
  Bit        Meaning
  0          P: Present / Not Present
  1          R/W: Read Only / Writeable
  2          U/S: User / Supervisor
  3          PWT: Page Level Write Through
  4          PCD: Page Level Cache Disable
  5          A: Accessed / Not Accessed
  6          Reserved in Page Map, Directory Pointer, Directory (4 KB Page); D: Dirty in Directory (2 MB Page), Page Table
  7          Reserved in Page Map; 0 in Directory Pointer, Directory (4 KB Page); 1 in Directory (2 MB Page); PAT in Page Table
  8          Reserved in Page Map, Directory Pointer, Directory (4 KB Page); G: Global in Directory (2 MB Page), Page Table
  9 - 11     Available to OS
  12 - 39    Page Map: Directory Pointer Address; Directory Pointer: Directory Address; Directory (4 KB Page): Page Table Address; Page Table: Page Address
             Directory (2 MB Page): bit 12 = PAT, bits 13 - 20 Reserved, bits 21 - 39 Page Address
  40 - 51    Reserved
  52 - 62    Available to OS
  63         Reserved
Entering 64-bit Mode
  OS running IA-32 protected mode with paging enabled
    Disable paging
    Enable physical address extensions (PAE)
      Allows 52-bit physical addresses
    Load physical base address of PML4 (top paging table) to CR3
    Enable x86-64 mode
    Enable 64-bit paging
  OS now running in 64-bit mode with 64-bit paging
    GDTR, LDTR, IDTR, TR still point to IA-32 descriptor tables
    Disable exceptions and interrupts
    Execute LGDT, LLDT, LIDT, and LTR
      Load physical base addresses to 64-bit descriptor tables
    Enable exceptions and interrupts
Address Translation in x86‐64