IA 32 Intel 32/64 Bit Architecture


Operating Modes for Intel x86 Processors

Architecture  Operating mode      Application  Operating system
IA-32         Real Mode           16-bit       16-bit OS
IA-32         Protected Mode      32-bit       32-bit OS
IA-32         V86 Mode            16-bit       32-bit OS
x86-64        Compatibility Mode  32-bit       64-bit OS
x86-64        64-bit Mode         64-bit       64-bit OS

16-bit applications are not supported under a 64-bit OS.

Modern Microprocessors — Fall 2012 IA-32 Dr. Martin Land 1 Modern Microprocessors — Fall 2012 IA-32 Dr. Martin Land 2

Intel 32-bit Architecture (IA-32)
  Instruction Set Architecture for 32-bit Intel processors
    1985 to now
    80386 through Core / Xeon / Centrino
  Characteristics
    Backward compatible with 8086, 80186, and 80286
    32-bit integer
    32-bit physical address
      2^32 bytes = 4 GB of addressable memory
    32-bit general purpose registers (GPR)
      EAX, EBX, ECX, EDX, ESP, EBP, ESI, EDI, EIP
    6 segment registers (SR)
      CS, DS, SS, ES, FS, GS
    Hardware support for operating system
  IA-32 introduced in 1985
    80386 processor + full Unix implementation

Operating Modes
  IA-32 processors initialize into real mode
  Real Mode
    Start-up mode
    8086 features
      16-bit integers and address offsets
      16-bit GPRs: AX, BX, CX, DX, SI, DI, BP, SP, IP
      4 segment registers: CS, DS, SS, ES
      20-bit physical address: access lowest 1 MB of RAM
      8086 interrupts
  32-bit OS shifts processor into protected mode
    Windows/Linux/Unix/Mac
  Protected Mode
    Full IA-32 features: 32-bit GPRs + 6 SRs + 8 system registers
    Hardware support for OS
      Task management
      Advanced segmentation model
      Virtual memory and paging management
      Advanced interrupt mechanism
IA-32 Memory Model
  Segmentation
    Functional division of address space
    Segment defined by type: Data / Code / System
    Access restricted by type
    LOGICAL ADDRESS = SEGMENT:OFFSET = software address
    Segment register holds SEGMENT = 16-bit segment SELECTOR
  Paging
    Virtual division of address space
    Managed by OS for page swapping + address aliasing
    LINEAR ADDRESS = 32-bit address seen by OS
  Physical address
    PHYSICAL ADDRESS = real address in physical memory
  Address translation
    Segmentation Unit → Paging Unit
    Logical Address → Linear Address → Physical Address

Logical Address Translation
  Logical to linear address
    Linear address = base address + offset
    Base address = linear address of first byte in segment
  8086 segment mapping
    SEG × 10h = 20-bit physical base address
    SEG = 16-bit segment selector (segment register value followed by the constant 0000b)
  IA-32 segment mapping
    Selector = index to descriptor table (table in RAM)
    Descriptor = table entry holding
      32-bit base address
      Segment size
      Segment type + access rights
  [Figure: a segment in RAM, with the base address locating its first byte and the offset locating the linear address within it]
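
The two segment mappings described above can be compared in a short sketch. This is an illustrative Python model rather than processor code: the function names, the toy descriptor lookup keyed directly by selector value, and the sample offsets are assumptions made for the example.

```python
# Illustrative model only; function names and the toy table are assumptions.

def real_mode_linear(seg: int, offset: int) -> int:
    """8086 mapping: base = SEG x 10h, then add the 16-bit offset."""
    base = (seg & 0xFFFF) << 4                    # SEG x 10h: selector followed by 0000b
    return (base + (offset & 0xFFFF)) & 0xFFFFF   # keep 20 address bits

def protected_mode_linear(selector: int, offset: int, descriptors: dict) -> int:
    """IA-32 mapping: the selector picks a descriptor that holds the base."""
    base = descriptors[selector]["base"]          # toy lookup keyed by selector value
    return (base + offset) & 0xFFFFFFFF           # 32-bit linear address

# Hypothetical descriptor: selector 1234h describes a segment with base 0.
toy_descriptors = {0x1234: {"base": 0x00000000}}

print(hex(real_mode_linear(0x1234, 0x5678)))                            # 0x179b8
print(hex(protected_mode_linear(0x1234, 0x11223344, toy_descriptors)))  # 0x11223344
```

In the real-mode case the base is computed from the selector itself; in the IA-32 case the base comes from a table entry, which is what lets the OS place a segment anywhere in the linear address space.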


OFFSET = Effective Address
  Effective Address in IA-32
    Example
      mov eax,[eax+4*edi+11223344h]
    Scale = 1, 2, 4, or 8
  [Figure: IA-32 instruction encoding]

Intel x86 General Register Access
  16-bit mode
    8-bit access: AL, BL, CL, DL, AH, BH, CH, DH
    16-bit registers: AX, BX, CX, DX, SI, DI, BP, SP
  32-bit mode
    32-bit registers: EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP
  Register layout
    Bits 31..16 | 15..8 | 7..0
    EAX: AX = bits 15..0, AH = bits 15..8, AL = bits 7..0
    EBX, ECX, EDX: same layout (BX/BH/BL, CX/CH/CL, DX/DH/DL)
    ESI, EDI, EBP, ESP: 32-bit registers without 8-bit halves
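
The effective address in the example instruction is the sum base + index × scale + displacement, truncated to 32 bits. A minimal sketch follows, assuming the scale factors 1, 2, 4, and 8 of the SIB byte; the function name and the register values are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def effective_address(base: int, index: int, scale: int, disp: int) -> int:
    """Return base + index*scale + disp, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + disp) & MASK32

# mov eax,[eax+4*edi+11223344h] with assumed register contents
eax, edi = 0x1000, 0x10
print(hex(effective_address(eax, edi, 4, 0x11223344)))  # 0x11224384
```

The result is only the OFFSET part of the logical address; the segment base is added afterwards by the segmentation unit.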

User Segment Registers
  Six user segment registers
    16-bit SELECTOR = pointer to memory segment
    GS, FS, ES, SS, CS, DS each hold a selector (bits 15..0)
  Six defined user segments
    CS
      Code segment
      Accessible by instruction fetch
    DS, ES, FS, GS
      Data segments
      Accessible by load / store instructions
    SS
      Stack segment
      Accessible by stack instructions

IA-32 Segmentation Example
  Offset
    32-bit number
    Combination of registers and immediate values
    Offset ∈ {00000000, … , FFFFFFFF}
    Maximum segment size = 2^32 bytes = 4 GB
  Linear Address = base address + offset
  Example
    Logical Address = 1234:11223344
    Segment selector = 1234 → descriptor table
    Base address = 00000000
    Linear Address = 0 + 11223344 = 11223344
  [Figure: bytes 00000000 to FFFFFFFF of a segment in RAM, with the base address and offset locating the linear address]



Typical Segment Register Usage
  DOS *.com program
    One 64 KB segment
    DS = ES = CS = SS
  DOS *.exe program
    Four defined segments: SS, CS, DS, ES
    Segment ≤ 64 KB
  Linux software
    One 4 GB segment
    DS = ES = CS = SS = FS = GS
  OS allocates memory to programs

Descriptor Tables
  Segment definition
    Write 64-bit descriptor → Descriptor Table in RAM
    Specify
      Base address: linear address of first byte in segment
      Limit: maximum offset into segment (segment size)
      Access: segment type + access rights
  GDT / LDT / IDT
    Global Descriptor Table (GDT)
      Accessible by any task
    Local Descriptor Table (LDT)
      Private to task
    Interrupt Descriptor Table (IDT)
      Accessed on trap / interrupt
  Shadow registers
    Descriptor entry copied to CPU from RAM table
    Each of GS, FS, ES, SS, CS, DS pairs a 16-bit selector (bits 15..0) with a 64-bit shadow descriptor (bits 63..0)
Task / Process Control in IA-32
  IA-32 process allocated Task State Segment (TSS)
    Context for swapped-out process
      General register values
      Segment register values
      Pointer (selector) to LDT via GDT
      Status registers
      OS-specific information
  TSS selector points to TSS entry via GDT
  [Figure: GDT entries TSS1, TSS2, TSS3 point to task1, task2, task3, each with its own LDT1, LDT2, LDT3]

Segmentation Table Registers
  GDT Register (Global Descriptor Table)
    GDT linear base address (bits 31..0) | GDT limit (bits 15..0)
  IDT Register (Interrupt Descriptor Table)
    IDT linear base address (bits 31..0) | IDT limit (bits 15..0)
  LDT Register (Local Descriptor Table)
    LDT Segment Selector (bits 15..0)
    Shadow Register: LDT Segment Descriptor (bits 63..0)
  TSS Register (Task State Segment)
    TSS Segment Selector (bits 15..0)
    Shadow Register: TSS Segment Descriptor (bits 63..0)


Segmentation Model
  [Figure: selectors in the segment registers index descriptors in the Global Descriptor Table (GDT, located by the GDT base in GDTR) or the Local Descriptor Table (LDT, located by the LDT base in LDTR); the selected descriptor supplies segment base, attributes, and limit to the segment shadow register]

Selector Format
  Field:  Index    | Table Index (TI) | Request Privilege Level (RPL)
  Width:  13 bits  | 1 bit            | 2 bits
  Index
    13-bit index to descriptor table
    2^13 = 2^3 × 2^10 = 8 × 1 K
    8 K (8192) descriptors per table
    Descriptor address = Table base address + descriptor offset
    Descriptor = 64 bits = 8 bytes
    Descriptor offset = Index × 8 = Index × 1000b = Selector AND FFF8h
    8 K descriptors/table × 8 bytes/descriptor = 64 KB/table
  Table Index
    TI = 0 for GDT
    TI = 1 for LDT
  Request Privilege Level (RPL)
    Copy of user privilege level when selector passed as pointer
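
The selector fields and the descriptor offset rule (Selector AND FFF8h) can be checked with a few lines of Python. This is a sketch under the field widths given above; the function names are made up for the illustration.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit selector into Index (13 bits), TI (1 bit), RPL (2 bits)."""
    return {
        "index": (selector >> 3) & 0x1FFF,
        "table": "LDT" if selector & 0x4 else "GDT",   # TI = 1 selects the LDT
        "rpl": selector & 0x3,
        "descriptor_offset": selector & 0xFFF8,        # Index x 8
    }

def descriptor_address(table_base: int, selector: int) -> int:
    """Descriptor address = table base address + descriptor offset."""
    return table_base + (selector & 0xFFF8)

print(decode_selector(0x1234))
# {'index': 582, 'table': 'LDT', 'rpl': 0, 'descriptor_offset': 4656}
```

Masking with FFF8h works because the three low bits (TI and RPL) occupy exactly the positions that multiplying the index by 8 would leave as zeros.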

Descriptor Format
  Field   Base      Access    Limit     Access   Base     Limit
  Bits    31..24    11..8     19..16    7..0     23..0    15..0
  Width   8         4         4         8        24       16
  Position in descriptor: 63..56 | 55..52 | 51..48 | 47..40 | 39..16 | 15..0

  Access field (12 bits)
    Bit              11  10  9  8  7  6-5  4      3-1   0
    User Segment     G   D   0  0  P  DPL  S = 1  TYPE  A
    System Segment   G   0   0  0  P  DPL  S = 0  TYPE (bits 3-0)

  Base     32-bit segment base address
  Limit    20-bit offset limit
           Segment Size = [1 + Limit] × (4096)^G bytes
           G = 0 ⇒ Byte Granularity: 2^20 bytes = 1 MB maximum segment size
           G = 1 ⇒ 4 KB Granularity: 2^20 × 4 KB = 4 GB maximum segment size
  Code Type
           D = 0 ⇒ default 16-bit code + effective address
           D = 1 ⇒ default 32-bit code + effective address
  Present  P = 1 ⇒ segment in memory
           P = 0 ⇒ swapped-out segment
  DPL      Descriptor Privilege Level
  System   S = 0 ⇒ system segment
           S = 1 ⇒ user segment
  Type     Segment type code
  Access   A = 0 ⇒ segment not accessed
           A = 1 ⇒ segment has been accessed

Type Fields
  User Segments (S = 1)
    Code Segment
      Type = 1 C R
      C = 1 for Conforming code (in protection scheme)
      R = 1 for Readable Code (MOV EAX,CS:EA legal)
    Data Segment
      Type = 0 ED W
      ED = 1 for Expand Down (stack segment)
      W = 0 for Read-Only segment
  System Segments (S = 0)
    LDT Segment
      Type = 0010
    Task State Segment
      Active task: Type = 1011
      Inactive task: Type = 1001
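
To make the split base and limit fields concrete, the sketch below unpacks a 64-bit user-segment descriptor according to the bit positions listed above and applies the segment size rule. The function name and the sample value (a flat 4 GB readable code segment) are assumptions chosen for illustration.

```python
def decode_descriptor(d: int) -> dict:
    """Unpack a 64-bit user-segment descriptor (bit positions as in the slide)."""
    limit = (d & 0xFFFF) | (((d >> 48) & 0xF) << 16)        # limit 15..0 and 19..16
    base = ((d >> 16) & 0xFFFFFF) | (((d >> 56) & 0xFF) << 24)  # base 23..0 and 31..24
    g = (d >> 55) & 1
    return {
        "base": base,
        "limit": limit,
        "G": g,
        "D": (d >> 54) & 1,
        "P": (d >> 47) & 1,
        "DPL": (d >> 45) & 3,
        "S": (d >> 44) & 1,
        "TYPE": (d >> 41) & 7,      # three type bits, e.g. 1 C R for code
        "A": (d >> 40) & 1,
        "size": (limit + 1) * (4096 if g else 1),   # [1 + Limit] x 4096^G bytes
    }

# Hypothetical flat code segment: base 0, limit FFFFFh, G = 1, D = 1, P = 1, DPL = 0
flat_code = decode_descriptor(0x00CF9A000000FFFF)
print(hex(flat_code["size"]))   # 0x100000000
```

With G = 1 the 20-bit limit counts 4 KB units, which is how a 20-bit field describes a 4 GB segment.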

Segment Address Translation
  16-bit selector: index (13 bits) | TI (1 bit) | RPL (2 bits)
  TI selects GDT or LDT; index selects the 64-bit descriptor (base, access, and limit fields)
  32-bit segment base address + 32-bit offset = 32-bit linear address

Control + Status Registers
  Control Registers
    CR0  Options
    CR1  Reserved
    CR2  Last Page Fault
    CR3  Directory Base Address
    CR4  Protected Mode Options
  EFLAGS Register
    [Figure: EFLAGS bit layout]

Linear Address Translation
  Linear address fields: directory (10 bits) | page table (10 bits) | offset (12 bits)
  directory = index into directory table
    2^10 = 1024 = 1 K page table entries per directory
    4 bytes per entry ⇒ 4 KB per directory
  page table = index into selected page table
    4 bytes per entry ⇒ 4 KB per page table
  offset = index (of byte) into selected page
    2^12 = 4096 = 4 KB per page
  2^10 page tables × 2^10 pages/page table × 2^12 bytes/page = 2^32 bytes

IA-32 4 KB Paging
  [Figure: the Page Directory Base Register locates the directory; a 32-bit directory entry locates the page table; a 32-bit page table entry locates the page; the offset selects the byte at the 32-bit address]
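
The 10/10/12 split of the linear address amounts to two shifts and three masks. A minimal sketch, with a hypothetical function name:

```python
def split_linear(addr: int) -> tuple:
    """Return (directory index, page table index, byte offset) for 4 KB paging."""
    directory = (addr >> 22) & 0x3FF   # upper 10 bits
    table = (addr >> 12) & 0x3FF       # middle 10 bits
    offset = addr & 0xFFF              # lower 12 bits
    return directory, table, offset

print(split_linear(0x11223344))   # (68, 547, 836)
```

Recombining the fields as directory × 2^22 + table × 2^12 + offset gives back the original address, which is the 2^10 × 2^10 × 2^12 = 2^32 count from the slide.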


Entries in Page Directory and Page Table
  Bits   31..12                              11..9        8  7   6  5  4..3  2    1    0
  Field  Upper 20 bits of Physical Address   OS reserved  G  PS  D  A  0     U/S  R/W  P

  Pages aligned on 4 KB boundaries
    Start address = page number × 1000h
    12 lower bits of start address = 000h
    Upper 20 bits of address = Page Number
  D: dirty bit
  A: accessed
  P = 0 ⇒ swapped-out, P = 1 ⇒ present
  PS: page size option
  G: global option
  U/S = 0 ⇒ supervisor (kernel mode) page
    R/W permission for user mode pages
    R/W permission for supervisor data pages
  U/S = 1 ⇒ user mode page
    No access to supervisor data pages
    Read-only access to data pages with R/W = 0
    Read/write access to data pages with R/W = 1

Translation Lookaside Buffer (TLB)
  Address cache: linear address → physical address
    Saves 32 linear address → physical address translations
  CPU makes two accesses in parallel
    Usual Directory/PT/Page translation
    Access to TLB
  If linear address in TLB
    TLB responds first and cancels translation
    TLB catches 98 - 99% of linear address accesses
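
The entry layout and the user-mode permission rules above can be expressed as a small decoder. This is a sketch of the checks as the slide states them, not a full model of the paging unit; the function names and sample entries are hypothetical.

```python
def decode_entry(entry: int) -> dict:
    """Unpack a 32-bit page directory / page table entry."""
    return {
        "page_number": entry >> 12,          # upper 20 bits of physical address
        "page_start": entry & 0xFFFFF000,    # page number x 1000h
        "G": (entry >> 8) & 1,
        "PS": (entry >> 7) & 1,
        "D": (entry >> 6) & 1,
        "A": (entry >> 5) & 1,
        "U/S": (entry >> 2) & 1,
        "R/W": (entry >> 1) & 1,
        "P": entry & 1,
    }

def user_access_allowed(entry: int, write: bool) -> bool:
    """User-mode code may touch only present user pages, writing only if R/W = 1."""
    e = decode_entry(entry)
    if not e["P"] or not e["U/S"]:
        return False
    return bool(e["R/W"]) or not write

print(user_access_allowed(0x11223005, write=False))  # True
print(user_access_allowed(0x11223005, write=True))   # False
```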
Logical Address to Physical Address
  SEGMENT:OFFSET
    SEGMENT indexes a descriptor in the descriptor table
    Descriptor cached in CPU segment shadow register
    Linear Base Address + Offset = Linear Address
  Paging not enabled
    Physical Address = Linear Address
  Paging enabled
    Linear Address = directory | page | offset
    Directory base address + directory and page lookups → Physical Address
  Translation Lookaside Buffer (TLB) (address cache)
    Linear Address → Physical Address

Building Page Tables
  Start with Physical Address = Linear Address
    Physical address ≠ linear address after swapping
  Build sequential tables: define sequential pages
  Page Table 0
    Starts on page boundary at address p_start
    Table entry = 32 bits = 4 bytes
    Entries / table = 1 K = 400h ⇒ table size = 4 KB = 1000h bytes
  Page Table PT
    Starts at address p_start + PT × 1000
    Entry n address = p_start + PT × 1000 + n × 4
    Points to page P = PT × 400 + n
    Page P starts at address P × 1000
    Entry = (PT × 400 + n) × 1000 + 12 bits of OS information
  Physical address = (PT × 400 + n) × 1000 + offset
                   = PT × 400000 + n × 1000 + offset
                   = PT shifted 22 left + n shifted 12 left + offset
    1000h = 2^12 ⇒ 12 left-shifts
    400h = 2^10 ⇒ 10 left-shifts
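
The entry formulas for sequential (identity-mapped) page tables can be tried out directly. The sketch below is a simplification: it stores tables in a Python dict keyed by table number instead of going through a page directory, and the flag value, function names, and p_start address are assumptions for the example.

```python
ENTRIES = 0x400    # 1 K entries per table
PAGE = 0x1000      # 4 KB per page

def build_identity_table(pt: int, flags: int = 0x003) -> list:
    """Entry n of table PT = (PT x 400h + n) x 1000h + 12 bits of OS information."""
    return [((pt * ENTRIES + n) * PAGE) | flags for n in range(ENTRIES)]

def entry_address(p_start: int, pt: int, n: int) -> int:
    """Address of entry n in page table PT: p_start + PT x 1000h + n x 4."""
    return p_start + pt * PAGE + n * 4

def translate(tables: dict, linear: int) -> int:
    """Walk the toy tables: PT = bits 31..22, n = bits 21..12, offset = bits 11..0."""
    pt, n, offset = linear >> 22, (linear >> 12) & 0x3FF, linear & 0xFFF
    return (tables[pt][n] & 0xFFFFF000) + offset

tables = {1: build_identity_table(1)}
print(hex(translate(tables, 0x00401234)))   # 0x401234
```

Because every entry points at the page with the same number as its own position, the translated address equals the linear address, which is the starting condition the slide describes.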

Linear Address Format
  Linear Address
    directory (10 bits) | page (10 bits) | offset (12 bits)
  Physical Address from page translation is
    A = PT × 400000 + n × 1000 + offset
    PT × 400000 = PT followed by 22 binary 0's
    n × 1000 = n followed by 12 binary 0's
    PT (10 bits) | n (10 bits) | offset (12 bits)
  Physical Address ≡ Linear Address

Intel 80386 Microprocessor
  [Figure not recovered]

Paging Options
  PG (paging)
    Bit 31 of CR0
    Enables IA-32 page translation
    32-bit linear address → 32-bit physical address
  Page Size Extensions (PSE)
    Bit 4 of CR4
    Enables large page sizes
      4 MB pages
      2 MB pages (with PAE flag set)
  Physical Address Extension (PAE)
    Bit 5 of CR4
    Enables 36-bit physical addressing
    32-bit linear address → 36-bit physical address

IA-32 4 KB Paging
  [Figure: Page Directory Base Register → 32-bit directory entry → 32-bit page table entry → 32-bit address]


4 MB Paging
  No middle address field
    Directory → single page table (1024 entries)
    Directory entry → 4 MB page (22-bit offset into page)
  [Figure: a 32-bit directory entry selects the 4 MB page; the offset selects the byte at the 32-bit address]

4 MB Page Directory Entry
  Page Table Attribute Index (PAT)
    Field introduced in Pentium III
    Enables reference to a table of detailed attribute definitions

P6 Physical Address Extension (PAE)
  36-bit physical address
    32-bit linear address → 36-bit physical address
    4 added address lines on CPU I/O bus
    2^36 bytes = 64 GB address space
  Modified Directory + Page Table structure
    Directory Pointer Table (DPT)
      Top of table hierarchy
      Defines 4 page table directories (first 2 bits in linear address)
    Directory and page tables
      64-bit entry → 36-bit physical addresses
      2^9 = 512 entries / table × 64 bits / entry = 4 KB / table
    Page size option
      4 KB or 2 MB

PAE: 4 KB Pages
  Linear address fields: Pointer (2 bits) | Dir (9 bits) | Table (9 bits) | Offset (12 bits)
  [Figure: DPT, directory, and page table each hold 64-bit entries; the translation yields a 36-bit address]
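
With PAE and 4 KB pages the linear address is still 32 bits, but it is split 2/9/9/12 instead of 10/10/12. A short sketch of that split (the function name is hypothetical):

```python
def split_linear_pae(addr: int) -> tuple:
    """Return (pointer, directory, table, offset) for PAE with 4 KB pages."""
    pointer = (addr >> 30) & 0x3       # selects 1 of 4 directories
    directory = (addr >> 21) & 0x1FF   # 9 bits: 512 entries
    table = (addr >> 12) & 0x1FF       # 9 bits: 512 entries
    offset = addr & 0xFFF              # 12 bits: 4 KB page
    return pointer, directory, table, offset

print(split_linear_pae(0x11223344))   # (0, 137, 35, 836)
print(512 * 8)                        # 4096 bytes per table
```

Each table keeps its 4 KB size because the entry count halves (1024 to 512) while the entry size doubles (4 bytes to 8 bytes).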


Table Entry: 4 KB Pages
  [Figure not recovered]

PAE: 2 MB Pages
  [Figure: 64-bit entry → 36-bit address]

Table Entry: 2 MB Pages
  [Figure not recovered]

Accessing 64 GB Physical Memory
  Page Table structure accesses 4 GB of 64 GB address space
    Directory Pointer Table (DPT) → 2^2 = 4 directories
    Directory → 2^9 = 512 page tables
    Page table → 2^9 = 512 pages
    Page = 2^12 bytes
    Simultaneous address space = 2^2 × 2^9 × 2^9 × 2^12 bytes = 2^32 bytes
  64 GB address space ⇒ DPT updates
    Address space permits 16 different DPT tables
      2^(36 − 32) = (64 GB / 4 GB) = 16
    4 of 64 possible directories "visible" at any time
  Accessing additional 4 GB memory sections
    Change base address for DPT
      New table defines 4 new directories
    Write new entries into DPT
      Entries point to new directories

New Instructions for IA-32
  Instruction       Description
  ARPL r/m16,r16    Adjust RPL of r/m16 to not less than RPL of r16
  LAR r16,r/m16     Load Access Rights: r16 ← r/m16 masked by FF00H
  LSL r16,r/m16     Load: r16 ← segment limit, selector r/m16
  LSL r32,r/m32     Load: r32 ← segment limit, selector r/m32
  SGDT, SIDT m      Store GDTR to m, Store IDTR to m
  SLDT r/m16        Stores segment selector from LDTR in r/m16
  STR r/m16         Stores segment selector from Task Register in r/m16
  VERR r/m16        Set ZF=1 if segment specified with r/m16 can be read
  VERW r/m16        Set ZF=1 if segment specified with r/m16 can be written
  CLTS              Clears Task Switch flag in CR0
  LGDT m16&32       Load m into GDTR
  LIDT m16&32       Load m into IDTR
  LLDT r/m16        Load segment selector r/m16 into LDTR
  LTR r/m16         Load r/m16 into task register

  r = register, m = memory pointer, 16/32 = length in bits
  r16 = {AX, CX, DX, BX, SP, BP, SI, DI}
  r32 = {EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI}

Segment-Level Memory Protection
  Segment tables "hide" segments from user
    User program knows segment selectors
    Segment base address hidden in descriptor
    Descriptor hidden in GDT / LDT
    GDT / LDT access: privileged machine instruction
  Local data / code segments
    Defined in LDT
    LDT selector hidden in Task State Segment (TSS)
    Tasks cannot locate / access
      TSS, LDT selector, LDT, segment defined in LDT
  Hardware denies memory access on
    Segment overflow
      Offset in logical address > segment limit in descriptor
    Action does not match access type in descriptor
      Write to CS / instruction fetch from DS / user access to system segment
    Insufficient privilege level
Privilege Rings
  Segment access parameters
    DPL field in segment descriptor
      Descriptor: Base | Access | Limit | Access | Base | Limit
      Access bits 11..0: G D 0 0 P DPL S TYPE A
    RPL field in segment selector
      Selector: Index (13 bits) | TI (1 bit) | RPL (2 bits)
  Ring  Function
  0     OS kernel mode
  1     Less sensitive OS functions
  2     Protected user functions
  3     User mode
  Most systems use Ring 0 = Kernel Mode and Ring 3 = User Mode

Access Rights by Privilege
  Memory access operations by segment type
    Data segment access
      Code performs load / store instruction
    Code segment access
      Code performs jump / call / interrupt instruction
  Current code
    Selector in CS → descriptor → code segment containing instruction
  Access rights
    Selector in CS → descriptor → DPL in access field
    Current Privilege Level (CPL) = DPL of current CS
  Forbidden accesses
    Load / store to data segment with DPL < CPL
    Jump / call to code segment with DPL < CPL
  [Figure: access granted or denied as a function of CPL and DPL across rings 0 to 3]


System Calls
  OS code
    Runs at DPL = 0
  User code
    Runs at CPL = 3
    Cannot jump or call OS code directly
  Gate mechanism
    OS advertises CS:EIP for system call
    CS points to special descriptor in GDT
    Similar mechanisms for system call / interrupts / task switch
  System call
    User calls CS:EIP
    CS = selector → descriptor = Call Gate
    Call Gate completes system call
  [Figure: user code in ring 3 reaches OS code in ring 0 through a call gate]

Gate Type System Segments (S = 0)
  Defines indirect access
    Selector → Gate in descriptor table
      Instead of normal descriptor
    Gate provides new logical address SEG:OFFSET
    Gate privilege permits ring 0 kernel access
  Word Count
    Privileged Stack: user stack segment reserved for system calls
    Gate call copies 32-bit words from User Stack to Privileged Stack
  Gate Format
    Field  OFFSET | access | 0 | word count | SEG | OFFSET
    Width  16     | 8      | 3 | 5          | 16  | 16
  Access
    bit 7: P | bits 6,5: DPL | bit 4: S = 0 | bits 3, 2, 1, 0: type
  Gate Type Field
    Call Gate       1100
    Interrupt Gate  0110 (16-bit gate), 1110 (32-bit gate)
    Task Gate       0101
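
The gate format above packs a SEG:OFFSET target, a word count, and an access byte into 64 bits. The sketch below builds and unpacks a gate using those field widths, reading the fields left to right from bit 63 down to bit 0; the function names and sample values are assumptions for illustration.

```python
def make_gate(offset: int, selector: int, word_count: int,
              dpl: int, gate_type: int, present: int = 1) -> int:
    """Pack OFFSET(31..16) | access | 000 | word count | SEG | OFFSET(15..0)."""
    access = (present << 7) | (dpl << 5) | gate_type   # bit 4 (S) stays 0
    return (((offset >> 16) & 0xFFFF) << 48) | (access << 40) \
        | ((word_count & 0x1F) << 32) | ((selector & 0xFFFF) << 16) \
        | (offset & 0xFFFF)

def decode_gate(g: int) -> dict:
    """Unpack the fields of a 64-bit gate descriptor."""
    access = (g >> 40) & 0xFF
    return {
        "offset": (((g >> 48) & 0xFFFF) << 16) | (g & 0xFFFF),
        "selector": (g >> 16) & 0xFFFF,
        "word_count": (g >> 32) & 0x1F,
        "P": access >> 7,
        "DPL": (access >> 5) & 3,
        "S": (access >> 4) & 1,
        "type": access & 0xF,
    }

# Hypothetical call gate (type 1100) reachable from ring 3, target 0008:11223344
gate = make_gate(0x11223344, 0x0008, 2, dpl=3, gate_type=0b1100)
print(hex(gate))            # 0x1122ec0200083344
print(decode_gate(gate)["type"] == 0b1100)   # True
```

The gate's own DPL (3 here) controls who may use the gate, while the selector inside it names the ring 0 code segment that the call actually enters.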

Call Gate
  Gate format
    Field  destination offset | access | 0 | word count | destination selector | destination offset
    Width  16                 | 8      | 3 | 5          | 16                   | 16
  Access byte
    bit 7: P | bits 6,5: DPL | bits 4,3,2,1,0: 01100
  [Figure: a system call to CS:EIP uses the selector to find the call gate in the GDT; the gate is loaded into the CS Shadow Register (CS descriptor) and redirects the user code in ring 3 to OS code in ring 0]

The Trojan Horse Problem
  Problem
    User program denied access to protected segment
      data DPL < user CPL
    User program performs system call
      Passes segment selector to OS as pointer
    OS accesses protected data segment
      (data DPL ≥ OS CPL)
  Solution
    Requested Privilege Level (RPL) field in selector
    OS adjusts selector passed by user program
      Sets RPL = user CPL
    Access permitted iff DPL ≥ max (CPL, RPL)
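
The RPL rule can be followed step by step in a few lines. This sketch models only the comparison DPL ≥ max(CPL, RPL) and an ARPL-style adjustment of the selector; the function names and selector values are hypothetical.

```python
def access_permitted(dpl: int, cpl: int, rpl: int) -> bool:
    """Data access is permitted iff DPL >= max(CPL, RPL)."""
    return dpl >= max(cpl, rpl)

def adjust_rpl(selector: int, caller_cs: int) -> int:
    """ARPL-style fix: raise the selector's RPL to at least the caller's RPL."""
    rpl = max(selector & 0x3, caller_cs & 0x3)
    return (selector & 0xFFFC) | rpl

protected_dpl = 0          # kernel-only data segment
os_cpl = 0
user_cs = 0x001B           # user code selector, RPL = CPL = 3
forged = 0x0010            # user passes a selector with RPL = 0

print(access_permitted(protected_dpl, os_cpl, forged & 3))    # True: Trojan horse succeeds
fixed = adjust_rpl(forged, user_cs)
print(access_permitted(protected_dpl, os_cpl, fixed & 3))     # False: access denied
```

Once the OS stamps the caller's privilege into the selector, the kernel's own CPL no longer helps the user reach data the user could not access directly.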


Interrupt Service
  Instruction INT n
  [Figure: CS:EIP selects an interrupt gate in the IDT; the gate (destination offset | access | 0 | word count | destination selector | destination offset) is loaded into the CS Shadow Register (CS descriptor)]
  Execute INT n
    CS:EIP of next instruction pushed onto stack
  Interrupt Gate
    Address = IDT base + n × 8
    Loaded to CS Shadow Register
    Selector:offset from Interrupt Gate loaded to CS:EIP
  CS:EIP = address of ISR (interrupt handler)
  ISR finishes with IRET → pop previous CS:EIP

Hardware Task Creation
  Write into Task State Segment (TSS)
    back link
    stacks and stack pointers for CPL = 0, 1, 2
      task switch to higher DPL → switch to separate stack
    CR3
    EIP
    EFLAGS
    EAX, ECX, EBX, EDX, ESP, EBP, ESI, EDI
    ES, CS, SS, DS, FS, GS
    LDT Selector
    OS specific information
  Write TSS descriptor
    Normal descriptor entered into GDT / LDT
    Points to TSS for task
  Write Task Gate
    Gate descriptor entered into GDT / LDT / IDT
    Destination selector points to TSS descriptor in GDT / LDT
Task Switching by Jump
  [Figure: JMP to CS:EIP uses the selector to find a task gate in the GDT; the gate is loaded into the CS Shadow Register; its destination selector goes to the TSS Register and the TSS descriptor to the TSS Shadow Register; the TSS supplies the task context for all CPU registers]

Task Switching by Jump
  No nesting
    Back link not set
  Current code executes JMP to CS:EIP
    CS selector → Task Gate in GDT / LDT
    Descriptor (Task Gate) loaded to CS Shadow Register
  CPU
    Recognizes descriptor = Task Gate
    Copies context of old task to old TSS
    Loads Destination Selector from Task Gate → TSS Register
    Selector in TSS Register → TSS descriptor
    Loads TSS descriptor to TSS Shadow Register
    Loads new context from new TSS
    Runs new task from CS:EIP from new task context


Context Switch
  [Figure: CPU registers (EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP, EIP; CS, DS, SS, ES, FS, GS with their descriptors; GDT address and limit; LDT selector and descriptor; TSS selector and descriptor) exchanged with the old TSS and new TSS in RAM through the GDT]
  1. New TSS selector
  2. TSS descriptor auto-update
  3. Auto-save old context
  4. Auto-load new context (values for LDT, segment registers, general registers)
  5. Auto-update shadow registers for LDT
  6. Auto-update shadow registers for CS, DS, SS, ES, FS, GS

Task Switching by Call / Return Instruction
  Current code executes CALL to CS:EIP
    Push CS:EIP of next instruction onto stack
    CS selector → Task Gate in GDT / LDT
    Descriptor (Task Gate) loaded to CS Shadow Register
  CPU
    Recognizes descriptor = Task Gate
    Copies context of old task to old TSS
    Writes old TSS Selector → back link of new TSS
    Loads Destination Selector → TSS Register
    Selector in TSS Register → TSS descriptor
    Loads TSS descriptor to TSS Shadow Register
    Loads context from new TSS to run called task
  Called task ends with IRET (or preemption)
  CPU
    Copies context of new task to new TSS
    Loads back link → TSS Register
    Selector in TSS Register → old TSS descriptor
    Loads old TSS descriptor → TSS Shadow Register
    Loads context from old TSS → restore old task
Real Mode
  Start up mode for IA-32 processor
    Processor runs like fast 8086
    Access only lowest 1 MB of memory
    OS boot code must be in low memory
  Create pseudo-descriptors in shadow registers
    Base Address field ← Selector × 10h
    Limit ← FFFFh
  CS access word
    G  D  0  0  P  DPL  S  CODE  C   R  A
    0  0  0  0  1  00   1  1     0   1  1
  DS, ES, FS, GS access word
    G  0  0  0  P  DPL  S  CODE  ED  W  A
    0  0  0  0  1  00   1  0     0   1  1
  SS access word
    G  0  0  0  P  DPL  S  CODE  ED  W  A
    0  0  0  0  1  00   1  0     1   1  1

Before Switching To Protected Mode
  OS starts in real mode
    Uses 8086 mechanisms
  Build GDT
    At least one Data Segment descriptor
    At least one Code Segment descriptor
  Build IDT
    Convert 32-bit 8086 ISR vectors to 64-bit ISR descriptors
  Build TSS for OS scheduler
    Put Task Gate for TSS into GDT
  Build Page Tables and Directory
    Linear Address = Physical Addresses
    Write Directory Physical Address into TSS
  Load GDT register and IDT register to CPU


Entering Protected Mode
  Set flag PE in CR0
    Enter protected mode
  JMP to Task Gate in GDT
    Loads Task Register
    Selector points to TSS Descriptor
    CPU loads scheduler context from TSS
  Set flag PG in CR0
    Enable paging (optional)
  OS scheduler now running in protected mode with paging
  OS creates processes by writing
    TSS
    GDT entries

Instruction Types
  New instruction encoding for IA-32
    [Figure: instruction encoding fields]
  Instruction prefix changes width of default instruction

  Code type     Operand width              Address width
                No prefix   Prefix 0x66    No prefix   Prefix 0x67
  16-bit code   16 bits     32 bits        16 bits     32 bits
  32-bit code   32 bits     16 bits        32 bits     16 bits

  Example for 16-bit code
    With prefix     66B844332211 → mov eax,0x11223344
    Without prefix  B844332211 → B84433 → mov ax,0x3344
                                 2211   → and dl,[bx+di]
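
The effect of the 0x66 operand-size prefix on the example bytes can be reproduced with a toy decoder. The sketch handles only the `mov reg, imm` opcodes B8h to BFh and is not a general disassembler; the function name and return format are assumptions.

```python
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]

def decode_mov_imm(code: bytes, default_bits: int) -> tuple:
    """Decode one 'mov reg, imm' (opcodes B8h..BFh); return (text, bytes consumed)."""
    pos, bits = 0, default_bits
    if code[pos] == 0x66:              # operand-size prefix toggles 16 <-> 32
        bits = 48 - default_bits
        pos += 1
    opcode = code[pos]
    if not 0xB8 <= opcode <= 0xBF:
        raise ValueError("only mov reg, imm is modelled here")
    pos += 1
    size = bits // 8                   # immediate is 2 or 4 bytes, little-endian
    imm = int.from_bytes(code[pos:pos + size], "little")
    reg = REG16[opcode - 0xB8]
    if bits == 32:
        reg = "e" + reg
    return f"mov {reg},{imm:#x}", pos + size

print(decode_mov_imm(bytes.fromhex("66B844332211"), 16))  # ('mov eax,0x11223344', 6)
print(decode_mov_imm(bytes.fromhex("B844332211"), 16))    # ('mov ax,0x3344', 3)
```

In the second call only three bytes are consumed, so the remaining bytes 22 11 would be decoded as a separate instruction, which is why a missing prefix changes the meaning of the rest of the stream.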

Example Code
        ORG 0x100
        section .text
        mov eax,11223344h
        push eax
        pop ebx
        call disp32
        mov ebx,55667788h
        call disp32
        mov ax,4C00h
        int 21h

disp32:
        mov cx,08h      ; counter = 8
        mov ah,02h      ; DOS function is print byte
nibble:
        rol ebx,4       ; move most significant nibble to least
        mov dl,bl       ; load BL to print buffer
        and dl,0fh      ; zero upper nibble
        add dl,30h      ; ASCII digit range
        cmp dl,39h      ; is nibble in [A-F]
        jle go          ; if not > 9 print
        add dl,7h       ; if > 9 ASCII letter range
go:     int 21h         ; print the byte
        loop nibble     ; CX-- and continue
        mov dl, 0dh     ; CR
        int 21h
        mov dl, 0ah     ; LF
        int 21h
        ret

Disassemble
        00000000  66B844332211  mov eax,0x11223344
        00000006  6650          push eax
        00000008  665B          pop ebx
        0000000A  E80E00        call 0x1b
        0000000D  66BB88776655  mov ebx,0x55667788
        00000013  E80500        call 0x1b
        00000016  B8004C        mov ax,0x4c00
        00000019  CD21          int 0x21
        0000001B  B90800        mov cx,0x8
        0000001E  B402          mov ah,0x2
        00000020  66C1C304      rol ebx,0x4
        00000024  88DA          mov dl,bl
        00000026  80E20F        and dl,0xf
        00000029  80C230        add dl,0x30
        0000002C  80FA39        cmp dl,0x39
        0000002F  7E03          jng 0x34
        00000031  80C207        add dl,0x7
        00000034  CD21          int 0x21
        00000036  E2E8          loop 0x20
        00000038  B20D          mov dl,0xd
        0000003A  CD21          int 0x21
        0000003C  B20A          mov dl,0xa
        0000003E  CD21          int 0x21
        00000040  C3            ret


Example of 32-bit Address Overrides
        ORG 0x100
        section .data
filename db "test.txt",0
        section .bss
handle  resw 1
        section .text
        mov eax,'abcd'
        mov ebx,'ABCD'
        mov ebp,2000h
        mov [ebp],eax
        mov [ebp+4],ebx
create: mov dx,filename ; point to file name
        mov cx,0h       ; default attributes
        mov ah,3ch      ; DOS create file
        int 21h         ; DOS system call
        jc end          ; stop on error
        mov [handle],ax ; store file handle
write:  mov bx,[handle] ; copy file handle to BX
        mov cx,8h       ; write 8 bytes to file
        mov edx,ebp     ; point EDX to buffer
        mov ah,40h      ; DOS write to file
        int 21h         ; DOS system call
        jc end          ; stop on error
close:  mov bx,[handle] ; copy file handle to BX
        mov ah,3eh      ; DOS close file
        int 21h         ; DOS system call
end:    mov ax,4C00h    ; return to DOS
        int 21h         ; DOS system call

        C:\nasm\programs>type TEST.TXT
        abcdABCD

Assembler Output
        00000100  66B861626364  mov eax,0x64636261
        00000106  66BB41424344  mov ebx,0x44434241
        0000010C  66BD00200000  mov ebp,0x2000
        00000112  6667894500    mov [ebp+0x0],eax
        00000117  6667895D04    mov [ebp+0x4],ebx
        0000011C  BA4801        mov dx,0x148
        0000011F  B90000        mov cx,0x0
        00000122  B43C          mov ah,0x3c
        00000124  CD21          int 0x21
        00000126  721B          jc 0x43
        00000128  A35401        mov [0x154],ax
        0000012B  8B1E5401      mov bx,[0x154]
        0000012F  B90800        mov cx,0x8
        00000132  6689EA        mov edx,ebp
        00000135  B440          mov ah,0x40
        00000137  CD21          int 0x21
        00000139  7208          jc 0x43
        0000013B  8B1E5401      mov bx,[0x154]
        0000013F  B43E          mov ah,0x3e
        00000141  CD21          int 0x21
        00000143  B8004C        mov ax,0x4c00
        00000146  CD21          int 0x21
        00000148  7465          jz 0xaf
        0000014A  7374          jnc 0xc0
        0000014C  2E7478        cs jz 0xc7
        0000014F  7400          jz 0x51

        The bytes from 00000148 onward (74 65 73 74 2E 74 78 74 00) are the string "test.txt",0 from the data section, shown by the disassembler as instructions.
Operating Modes for Intel x86 Processors

Architecture  Operating mode      Application  Operating system
IA-32         Real Mode           16-bit       16-bit OS
IA-32         Protected Mode      32-bit       32-bit OS
IA-32         V86 Mode            16-bit       32-bit OS
x86-64        Compatibility Mode  32-bit       64-bit OS
x86-64        64-bit Mode         64-bit       64-bit OS

16-bit applications are not supported under a 64-bit OS.

Why 64 bits?
  Features of true 64-bit architecture
    64-bit ALU integer operands
    64-bit general purpose register set
    64-bit flat virtual address space
  Advantages of 64-bit architecture
    Huge virtual address space
      2^64 bytes = 2^4 × (2^30)^2 = 16 Giga-GB = 16 Exabytes
      Serve many users accessing huge databases
    Perform high precision arithmetic efficiently
      64-bit integer ALU and 128-bit long ALU operations
      Perform scientific and CAD/CAM/CAE calculations
  Examples of true 64-bit architecture
    PowerPC, Sparc, Alpha, IA-64 (Itanium)


Metric Prefixes

Prefix  Symbol  Decimal  Binary  Binary value
Kilo    K       10^3     2^10    1,024
Mega    M       10^6     2^20    1,048,576
Giga    G       10^9     2^30    1,073,741,824
Tera    T       10^12    2^40    1,099,511,627,776
Peta    P       10^15    2^50    1,125,899,906,842,624
Exa     E       10^18    2^60    1,152,921,504,606,846,976

Data Types
  [Figure not recovered]

Operand Ranges
  [Figure not recovered]

How to Think About x86-64?
  x86-64 not true 64-bit architecture
    Optimized for default 32-bit integer
      64-bit integer operations by override
    Optimized for default 32-bit register accesses
      64-bit register accesses by override
    64-bit virtual address space
      "Tricks" standard IA-32 segmentation system
  Why x86-64 (Intel)
    Easy migration path from IA-32 to 64-bits
    Provides some 64-bit features
    Preserves IA-32 Instruction Set Architecture (ISA)
    Preserves most IA-32 software in most circumstances
    Preserves IA-32 "knowledge base"


What Intel Said
  The move toward 64-bit computing for mainstream applications will initially focus on applications that are already constrained by 32-bit memory limitations.
  The challenge for IT organizations is to determine the best architecture for specific solutions, while taking into account total cost and value within the broader IT and business environments.
  Itanium architecture remains the platform of choice for the most demanding, business-critical data tier applications, such as high-end database and business intelligence solutions.
  Platforms based on the Intel Xeon processor with Intel EM64T are preferable for general purpose applications, such as Web and mail infrastructure, digital content creation, mechanical computer aided design, and electronic design automation; and for mixed environments in which optimized 32-bit performance remains critical.
  The 64-bit Tipping Point, September 2004

64-bit Extensions
  Operands
    IA-32 registers → 64-bit width
      EAX, EBX, ECX, EDX, ESP, EBP, ESI, EDI, EIP →
      RAX, RBX, RCX, RDX, RSP, RBP, RSI, RDI, RIP
    8 new general purpose registers (GPR)
      R8, R9, ... , R15
    Default 32-bit integer operand
      Override → 64-bit integer ALU operations
  Address
    64-bit flat linear address space
      Logical address = SEG:OFFSET
      SEG CS, DS, ES, SS → physical base address = 0
      64-bit OFFSET = Linear Address
      Segmentation enforces protection
    64-bit paging
      52-bit physical address
Summary of Operating Modes

Operating Mode               Operating   Register     Default Address  Default Operand  Typical GPR
                             System      Extensions   Size (bits)      Size (bits)      Width (bits)
x86-64  64-bit Mode          64-bit OS   yes          64               32               64
x86-64  Compatibility Mode   64-bit OS   no           32               32               32
                                                      16               16               16
IA-32   Protected Mode       32-bit OS   no           32               32               32
IA-32   Real Mode            16-bit OS   no           16               16               16

Booting 64-bit OS
Initialize CPU into real mode
Switch CPU to 32-bit protected mode
Switch CPU to 64-bit mode
OS runs
64-bit applications
32-bit applications (in compatibility mode)

64-bit Applications — Typical Features
Code fetch
CS → code segment base address = 0
64-bit instruction pointer RIP
Linear Address = RIP
IA-32 instruction syntax
PUSH, POP, MOV, ADD, SUB, … work as usual
Most instructions use 32-bit operands
MOV EAX, 11223344h
ADD EAX, [EBX+ESI+11223344]
64-bit virtual address
ADD EAX, [EBX+ESI+11223344]
Logical Address = DS:EBX+ESI+11223344
DS → data segment base address = 0
Zero extend 32-bit EA = EBX+ESI+11223344 → 64-bit EA64 (the 32-bit displacement itself is sign extended)
Linear Address = EA64

IA-32 Prefixes
IA-32 code prefixes — override default parameters
66H
16-bit code — 16-bit operand → 32-bit operand
32-bit code — 32-bit operand → 16-bit operand
67H
16-bit code — 16-bit address offset → 32-bit address offset
32-bit code — 32-bit address offset → 16-bit address offset

Example
16-bit code fragment
66B861626364 mov eax,0x64636261
6667894500 mov [ebp+0x0],eax
6667895D04 mov [ebp+0x4],ebx
32-bit code fragment
66B86162 mov ax,0x6261

Prefixes for Instruction Encoding
General instruction encoding
Legacy prefixes | REX prefix | Opcode | ModR/M | SIB | Displacement | Immediate
Legacy prefix = IA-32 prefix
ModR/M = mod-reg-r/m
SIB = scale-index-base
Effective Address = base + scale * index + displacement
REX prefixes
Override default operand / address size
Combined with 66H and 67H codes
Format: 0100 W R X B (field widths 4, 1, 1, 1, 1 bits)
W — operand width
R — register (extends reg field in ModR/M)
X — index (extends index field in SIB)
B — base (extends base field in SIB or r/m field in ModR/M)
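
The REX layout 0100 W R X B can be checked with a few bit operations. The following is a minimal sketch, not a full instruction decoder; the function name `decode_rex` is hypothetical.

```python
def decode_rex(byte):
    """Split a REX prefix byte (0100WRXB) into its four flag bits.

    Returns None when the high nibble is not 0100, i.e. the byte
    is not a REX prefix.
    """
    if byte >> 4 != 0b0100:
        return None
    return {
        "W": (byte >> 3) & 1,  # 1 selects a 64-bit operand
        "R": (byte >> 2) & 1,  # extends the ModR/M reg field
        "X": (byte >> 1) & 1,  # extends the SIB index field
        "B": byte & 1,         # extends the base / r/m field
    }


print(decode_rex(0x48))  # REX with only W set
print(decode_rex(0x66))  # operand-size legacy prefix, not a REX byte
```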

IA-32 and REX Overrides

Mode            Default Address   Linear Address   Prefix
                Size (bits)       Size (bits)
64-bit          64                64               —
                                  32               0x67
Compatibility   32                32               —
                                  16               0x67
                16                32               0x67
                                  16               —

Mode            Default Operand   Effective Operand   IA-32     REX.W
                Size (bits)       Size (bits)         Prefix
64-bit          32                64                  Ignore    1
                                  32                  —         0
                                  16                  66H       0
Compatibility   32                32                  —         —
                                  16                  66H       —
                16                32                  66H       —
                                  16                  —         —

Register Access
Register accesses
64-bit operations access entire register
32-bit operations access lower 32-bits of 64-bit registers (default)
16-bit operations access lower 16-bits of 64-bit registers (where permitted)
8-bit operations access lower 8-bits of 64-bit registers
Access lower 8-bits of legacy base/index registers
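
The two override tables reduce to a pair of small decision functions. The sketch below is one way to encode them, assuming the mode is passed as the string "64-bit" or "compatibility"; the function and parameter names are illustrative only.

```python
def effective_operand_size(mode, default_bits=32, prefix_66=False, rex_w=False):
    """Effective operand size in bits, following the operand-size table."""
    if mode == "64-bit":
        if rex_w:
            return 64  # REX.W = 1 selects 64 bits; a 66H prefix is ignored
        return 16 if prefix_66 else 32
    if mode == "compatibility":
        if default_bits not in (16, 32):
            raise ValueError("compatibility-mode default must be 16 or 32")
        if prefix_66:
            return 16 if default_bits == 32 else 32  # 66H toggles the default
        return default_bits
    raise ValueError(f"unknown mode: {mode}")


def effective_address_size(mode, default_bits=32, prefix_67=False):
    """Effective address size in bits, following the address-size table."""
    if mode == "64-bit":
        return 32 if prefix_67 else 64
    if mode == "compatibility":
        if default_bits not in (16, 32):
            raise ValueError("compatibility-mode default must be 16 or 32")
        if prefix_67:
            return 16 if default_bits == 32 else 32  # 67H toggles the default
        return default_bits
    raise ValueError(f"unknown mode: {mode}")


print(effective_operand_size("64-bit", rex_w=True))
print(effective_address_size("compatibility", default_bits=16, prefix_67=True))
```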

Instructions and Operands
Default effective address is 64 bits
EA = Base + Scale * Index + Displacement
Default operand — 32 bits
ADD EAX,[RBX]
REX override operand → 64 bits
ADD RAX,[RBX]
Most displacements — 32 bits
ADD EAX,[RSI+11223344]
ADD RAX,[RSI+11223344]
Special form of MOV — 64-bit absolute address
MOV RAX,[1122334455667788]
Most immediates — 32 bits
ADD EAX,11223344
Sign extended immediates for 64-bit operation
ADD RAX,11223344
Special form of MOV — 64-bit immediate
MOV RAX,1122334455667788

Other Instructions
Branch
Near branch — default target address = 64 bits: JMP targ
Far branch — use indirect target: JMP [pointer]
Loop instructions check RCX
Stack instructions
Operand — 64 bits
PUSH RAX
New addressing mode (in 64-bit mode)
RIP-relative
EA = 64-bit RIP + 32-bit displacement
String instructions
LODSQ ; RAX ← [DS:RSI] , RSI ← RSI + 8
STOSQ ; [ES:RDI] ← RAX , RDI ← RDI + 8
MOVSQ
CMPSQ
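
The address arithmetic above can be sketched in a few lines. This is a simplified model, assuming 32-bit displacements are sign extended to 64 bits and that RIP already holds the address of the next instruction when a RIP-relative address is formed; the helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def sign_extend32(value):
    """Sign-extend a 32-bit value to 64 bits (result as an unsigned 64-bit int)."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value |= 0xFFFFFFFF00000000
    return value


def effective_address(base, index=0, scale=1, disp32=0):
    """EA = Base + Scale * Index + Displacement, wrapped to 64 bits."""
    return (base + scale * index + sign_extend32(disp32)) & MASK64


def rip_relative(next_rip, disp32):
    """RIP-relative EA = 64-bit RIP + sign-extended 32-bit displacement."""
    return (next_rip + sign_extend32(disp32)) & MASK64


# ADD EAX,[RSI+11223344] with RSI = 0x1000 (values chosen for illustration)
print(hex(effective_address(0x1000, disp32=0x11223344)))
# A negative displacement (-16) moves the address backwards
print(hex(rip_relative(0x401000, 0xFFFFFFF0)))
```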
Segmentation Model
[Figure: selectors in the segment registers index descriptors in the Global Descriptor Table (GDT, base in GDTR) and the Local Descriptor Table (LDT, base in LDTR); the segment base, limit, and attributes from the descriptor load into the segment shadow register]

Segmentation in x86-64
Segment selectors
As in IA-32
Descriptor tables
Direct descriptor table registers
Global Descriptor Table Register (GDTR)
Local Descriptor Table Register (LDTR)
Interrupt Descriptor Table Register (IDTR)
Located at 64-bit linear base address
10-byte registers
64-bit table base address (8 bytes)
16-bit table limit (2 bytes)
Descriptors
User descriptor structure identical to IA-32
System descriptors expanded to 128 bits
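
The 10-byte descriptor table register image (2-byte limit plus 8-byte base) can be modeled with Python's `struct` module. This sketch assumes the little-endian memory layout used by LGDT/LIDT operands, with the limit stored first; `pack_dtr` and `unpack_dtr` are illustrative names.

```python
import struct

# "<" = little-endian with no padding, "H" = 16-bit limit, "Q" = 64-bit base
DTR_FORMAT = "<HQ"


def pack_dtr(base, limit):
    """Build the 10-byte image: 16-bit table limit, then 64-bit table base."""
    return struct.pack(DTR_FORMAT, limit, base)


def unpack_dtr(image):
    """Recover (base, limit) from a 10-byte image."""
    limit, base = struct.unpack(DTR_FORMAT, image)
    return base, limit


image = pack_dtr(base=0xFFFF800000001000, limit=0x007F)
print(len(image), image.hex())
```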

Selector/Descriptor Formats Segmentation Process in 64‐bit Mode
Code segment
Load selector to CS register ⎯ selector points to descriptor
Load descriptor to shadow register
Bit L = 1 ⇒ 64-bit mode (L = 0 ⇒ compatibility mode)
Check DPL = current privilege level
Other descriptor fields ignored

Data, stack, and extra segments


No selector or descriptor loaded
No attribute checking

FS and GS
Load selector to FS/GS register ⎯ selector points to descriptor
Load descriptor to shadow register
Shadow register expanded → 64-bit segment base address
Other descriptor fields ignored
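
A simplified model of the resulting address formation in 64-bit mode: CS, DS, ES, and SS behave as if their base were 0, while FS and GS contribute a 64-bit base from their shadow registers. The function name and the way bases are passed in are assumptions made for illustration.

```python
MASK64 = (1 << 64) - 1


def linear_address_64(segment, offset, fs_base=0, gs_base=0):
    """Linear address for SEG:OFFSET in 64-bit mode (flat model sketch)."""
    segment = segment.upper()
    if segment in ("CS", "DS", "ES", "SS"):
        base = 0            # base treated as 0, so linear address = offset
    elif segment == "FS":
        base = fs_base      # 64-bit base held in the FS shadow register
    elif segment == "GS":
        base = gs_base      # 64-bit base held in the GS shadow register
    else:
        raise ValueError(f"unknown segment register: {segment}")
    return (base + offset) & MASK64


print(hex(linear_address_64("DS", 0x1000)))
print(hex(linear_address_64("GS", 0x28, gs_base=0xFFFF800000000000)))
```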
Canonical Virtual Address
Virtual (linear) address
Maximum address length = 64 bits
Minimum implemented address length = 48 bits
2^48 bytes = 256 × 2^10 × 2^30 = 256 Kilo-GB = 256 TB
Procedures for address longer than 48 bits are defined by Intel or AMD
Canonical Form
Bit MSB+1 to bit 63 = copy of MSB
Sign-extended format
Splits address space into "positive" and "negative" sectors
Used by OS for system management
[Figure: 64-bit address layout; bits 47–0 hold the minimum virtual address with MSB at bit 47, bits 63–48 hold sign-extension copies of the MSB; the implemented virtual address may extend above bit 47 toward the 64-bit maximum virtual address]

Physical Address Extension (PAE) Paging
3 level paging system for 36-bit physical address
64-bit Directory Pointer Table entry points to Directory
64-bit Directory entry points to Page Table / Page
64-bit Page Table entry points to Page
12/21-bit Offset points to byte in Page
(512 table entries per table) × (64-bits = 8 bytes per entry) = 4 KB / table
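
The canonical form rule (every bit above the MSB is a copy of the MSB) is easy to test numerically. A minimal sketch, assuming addresses are given as unsigned 64-bit integers and 48 implemented bits by default; the function names are hypothetical.

```python
def is_canonical(addr, implemented_bits=48):
    """True if bits implemented_bits..63 all equal bit implemented_bits - 1."""
    upper = addr >> (implemented_bits - 1)            # the MSB and every bit above it
    all_ones = (1 << (64 - implemented_bits + 1)) - 1
    return upper == 0 or upper == all_ones


def make_canonical(addr, implemented_bits=48):
    """Sign-extend the low implemented_bits of addr into a 64-bit canonical address."""
    low_mask = (1 << implemented_bits) - 1
    low = addr & low_mask
    if low >> (implemented_bits - 1):                  # MSB set: "negative" half
        low |= ((1 << 64) - 1) ^ low_mask
    return low


print(is_canonical(0x00007FFFFFFFFFFF))   # top of the "positive" half
print(is_canonical(0x0000800000000000))   # bit 47 set but bits 63-48 clear
print(hex(make_canonical(0x0000800000000000)))
```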


64‐bit Paging 2 MB Page Size in 64‐bit Mode

48 significant bit
canonical address

Table structure
Entries as in PAE
512 entries in Pointer Table (4 in PAE)
Added Page-Map Table at top of hierarchy defines 512 Pointer Tables
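
With 512 entries per table, each level of the hierarchy consumes 9 bits of the 48-bit canonical address. The sketch below splits an address into table indices and page offset for 4 KB and 2 MB pages; the dictionary keys are labels chosen here for illustration.

```python
def split_4kb(va):
    """Split a 48-bit virtual address into four 9-bit indices and a 12-bit offset."""
    return {
        "page_map": (va >> 39) & 0x1FF,           # index into Page-Map Table
        "directory_pointer": (va >> 30) & 0x1FF,  # index into Pointer Table
        "directory": (va >> 21) & 0x1FF,          # index into Directory
        "page_table": (va >> 12) & 0x1FF,         # index into Page Table
        "offset": va & 0xFFF,                     # byte within 4 KB page
    }


def split_2mb(va):
    """With 2 MB pages the Directory entry maps the page, leaving a 21-bit offset."""
    return {
        "page_map": (va >> 39) & 0x1FF,
        "directory_pointer": (va >> 30) & 0x1FF,
        "directory": (va >> 21) & 0x1FF,
        "offset": va & 0x1FFFFF,                  # byte within 2 MB page
    }


va = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
print(split_4kb(va))
print(split_2mb(va))
```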

Table Entries

Bit       Page Map            Directory Pointer   Directory           Directory           Page Table
                                                  (4 KB Page)         (2 MB Page)
0         P ⎯ Present / Not Present (all entries)
1         R/W ⎯ Read Only / Writeable (all entries)
2         U/S ⎯ User / Supervisor (all entries)
3         PWT ⎯ Page Level Write Through (all entries)
4         PCD ⎯ Page Level Cache Disable (all entries)
5         A ⎯ Accessed / Not Accessed (all entries)
6         Reserved            Reserved            Reserved            D ⎯ Dirty           D ⎯ Dirty
7         Reserved            Reserved            0                   1                   PAT
8         Reserved            Reserved            Reserved            G ⎯ Global          G ⎯ Global
9 – 11    Available to OS (all entries)
12        Directory Pointer   Directory           Page Table          PAT                 Page
13 – 20   Address             Address             Address             Reserved            Address
21 – 39   (bits 12 – 39)      (bits 12 – 39)      (bits 12 – 39)      Page Address        (bits 12 – 39)
40 – 51   Reserved (all entries)
52 – 62   Available to OS (all entries)
63        Reserved (all entries)

Entering 64-bit Mode
OS running IA-32 protected mode with paging enabled
Disable paging
Enable physical address extensions (PAE)
Allows 52-bit physical addresses
Load physical base address of PML4 (top paging table) to CR3
Enable x86-64 mode
Enable 64-bit paging
OS now running in 64-bit mode with 64-bit paging
GDTR, LDTR, IDTR, TR still point to IA-32 descriptor tables
Disable exceptions and interrupts
Execute LGDT, LLDT, LIDT, and LTR
Load physical base addresses to 64-bit descriptor tables
Enable exceptions and interrupts
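
The flag bits listed in the entry table can be pulled out of a 64-bit Page Table entry with shifts and masks. This sketch covers only the Page Table (4 KB page) column and takes the page address from bits 12 to 39 as shown in the table; the names `FLAG_BITS` and `decode_page_table_entry` are hypothetical.

```python
# Bit positions taken from the Page Table column of the entry table.
FLAG_BITS = {"P": 0, "R/W": 1, "U/S": 2, "PWT": 3, "PCD": 4, "A": 5, "D": 6, "G": 8}

# Bits 12-39 hold the page address in this layout.
ADDRESS_MASK = ((1 << 40) - 1) & ~0xFFF


def decode_page_table_entry(entry):
    """Return (flags, page_address) for a 64-bit Page Table entry."""
    flags = {name: (entry >> bit) & 1 for name, bit in FLAG_BITS.items()}
    return flags, entry & ADDRESS_MASK


# Example entry: page at 0x12345000, present, writeable, user, accessed, dirty
flags, address = decode_page_table_entry(0x12345000 | 0x67)
print(flags, hex(address))
```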


64-bit Mode → Compatibility Mode
No change to
Segment registers / segment shadow registers
Descriptor table physical base registers
Physical base address of PML4 (top paging table)
CPU creates "virtual protected mode" environment
CS descriptor checked for bit L
1 — 64-bit mode
0 — indicates compatibility mode
Other descriptor fields treated as in IA-32
IA-32 segmentation and paging enabled
16-bit / 32-bit address and operand sizes
Access to lower 4 GB of linear address space
IA-32 instruction prefixes and registers
32-bit registers and memory accesses
REX prefixes ignored

Set Up Compatibility Mode for Application
In 64-bit mode
Load DS, ES, SS with selectors
MOV SREG, source / POP SREG
CPU loads descriptor from GDT / LDT
Descriptor base, limit, and attribute loaded to shadow registers
64-bit mode ignores
Contents of data and stack segment selectors
Descriptor shadow registers
Call / jump / interrupt / task switch to compatibility mode CS
CPU loads selector to CS
CPU loads CS descriptor from GDT / LDT
Descriptor base, limit, and attribute loaded to shadow register
CS.L = 0 ⇒ compatibility mode code segment
CPU runs code in compatibility mode

Address Translation in x86‐64
