Virtual Memory
[Figure: system layers — Applications, Operating System, CPU Hardware, Disk/RAM]
Problems with physical memory
• Size?
• Holes in address space?
• Isolation?
Problems with physical memory
• Size: a program crashes if it tries to access more memory than is physically installed
• Holes: fragmentation leaves unusable gaps in the address space
• Isolation: multiple programs may access the same address
Virtual vs. Physical Memory
• Physical memory: the actual memory installed in the computer
• Virtual memory: logical memory space owned by each process
– Virtual memory space can be much larger than the physical memory space
– Only part of the program needs to be in physical memory for execution
[Figure: processes 1–3, each with its own virtual memory space starting at 0, mapped into a smaller physical memory of fixed physical size]
Virtual Address
• Processes use virtual (logical) addresses
– Makes it easier to manage the memory of multiple processes
– Virtual addresses are independent of the actual physical location of the data referenced
– Instructions executed by the CPU issue virtual addresses
– Virtual addresses are translated by hardware into physical addresses (with help from the OS)
Virtual Address
*MMU: Memory Management Unit
[Figure: CPU issues a virtual address → MMU translates it → physical address → memory]
• MMU: translates virtual addresses to physical addresses dynamically at every reference
– Many ways to do this translation…
– Needs hardware support and OS management algorithms
• Requirements
– Protection – restrict which addresses processes can use
– Fast translation – lookups need to be fast
– Fast change – updating memory hardware on context switch
External Fragmentation
[Figure: physical memory holding processes 1, 2, and 4, separated by free fragments — not enough contiguous physical space remains for a new process]
Paging
• Main Idea: Split address space into equal sized units called pages
– Each can go anywhere! e.g., page size of 4KB
[Figure: virtual pages 1…N mapped to arbitrary, non-contiguous frames in physical memory]
Page Lookups
• Page table: an array of page table entries (PTEs) that maps
virtual pages to physical pages
– Per-process kernel data structure (part of process’s state) (why?)
[Figure: a virtual address is split into a page number and a p-bit offset (page size = 2^p); the page table maps the page number to a page frame, and the offset is copied unchanged into the physical address]
Why Virtual Memory (VM)?
• Simplifies memory management
– Each process gets the same uniform linear address space
Example Memory Hierarchy
Smaller, faster, and costlier per byte at the top; larger, slower, and cheaper per byte at the bottom:
• L0: CPU registers — hold words retrieved from the L1 cache
• L1: L1 cache (SRAM) — holds cache lines retrieved from the L2 cache
• L2: L2 cache (SRAM) — holds cache lines retrieved from main memory
• L3: Main memory (DRAM) — holds disk blocks retrieved from local disks
• L4: Local secondary storage (local disks) — holds files retrieved from disks on remote network servers
Memory hierarchy
• Cache: A smaller, faster storage device that acts as a staging
area for a subset of the data in a larger, slower device.
• Big Idea: The memory hierarchy creates a large pool of storage that costs as
much as the cheap storage near the bottom, but that serves data to
programs at the rate of the fast storage near the top.
VM as a tool for caching
• Virtual memory: mapped to N contiguous bytes stored on disk
• The contents of the array on disk are cached in physical memory
– DRAM acts as a cache for disk
– These cache blocks are called pages (size is P = 2^p bytes)
Page Tables
• Page table: page table entries (PTEs) map virtual pages to physical pages
– PTE holds the physical page number if the page is cached (resident) in DRAM, or the disk address otherwise
[Figure: a memory-resident page table in DRAM; PTEs with valid bit 1 map VP 1, VP 2, VP 7, and VP 4 to physical pages PP 0–PP 3; PTEs with valid bit 0 are null or hold the disk addresses of VP 3 and VP 6, which reside only in virtual memory on disk]
Page Hit
• Page hit: Access to a page that is in physical memory
[Figure: same diagram — the virtual address selects a PTE with valid bit 1, which points directly to a physical page in DRAM]
Page Fault
• Page fault: Access to a page that is not in physical memory
[Figure: same diagram — the virtual address selects a PTE with valid bit 0, whose entry holds the disk address of VP 3]
Handling Page Fault
• Page miss causes page fault (an exception)
[Figure: the faulting reference selects the PTE for VP 3, which is not resident in DRAM]
Handling Page Fault
• Page miss causes page fault (an exception)
• Page fault handler selects a victim to be evicted (here VP 4)
[Figure: VP 4, resident in PP 3, is chosen as the victim]
Handling Page Fault
• Page miss causes page fault (an exception)
• Page fault handler selects a victim to be evicted (here VP 4)
[Figure: VP 4 is evicted — its PTE valid bit is cleared and PP 3 becomes empty]
Handling Page Fault
• Page miss causes page fault (an exception)
• Page fault handler selects a victim to be evicted (here VP 4)
[Figure: VP 3 is read from disk into PP 3 and its PTE is updated to valid]
Handling Page Fault
• Page miss causes page fault (an exception)
• Page fault handler selects a victim to be evicted (here VP 4)
• Offending instruction is restarted: page hit!
[Figure: re-running the same instruction now finds VP 3 resident in PP 3 → page hit]
Overhead due to a page fault
• PTE indicates the page is on disk: 1 cycle
• CPU raises a page fault exception: 100 cycles
• OS page fault handler is called
– OS chooses a page to evict from DRAM: 10K cycles
– If dirty, the victim page is written back to disk first: 40M cycles
– OS then reads the new page from disk into DRAM: 40M cycles
– OS updates the page table to map the new page: 1K cycles
• OS resumes the instruction that caused the page fault: 10K cycles
• Page faults are the slowest thing (except for human interaction)
• Interestingly, there are systems that do not page:
– iOS: kills the program if it uses too much memory
VM as a Tool for Mem Management
• Each process has its own virtual address space
– It can view memory as a simple linear array
– Mapping function scatters addresses through physical memory
• Well chosen mappings simplify memory allocation and management
[Figure: the virtual address spaces of processes 1 and 2 (addresses 0 to N-1) are translated into one physical address space (0 to M-1); both processes map a virtual page to PP 6, e.g., read-only library code]
VM as a Tool for Mem Management
• Memory allocation
– Each virtual page can be mapped to any physical page
– A virtual page can be stored in different physical pages at different times
• Sharing code and data among processes
– Map virtual pages to the same physical page (here: PP 6)
[Figure: same diagram — each process's virtual pages map to scattered physical pages, with PP 6 shared by both processes]
Sharing
• Can map shared physical memory at the same or different virtual addresses in each process's address space
– Same VA: both P1 and P2 map their 1st virtual page (VP 1) to the shared physical page (PP 2)
[Figure: processes P1 and P2 each map VP 1 to the same page in physical memory]
Copy on Write
• OSes spend a lot of time copying data
– System call arguments between user/kernel space
– Entire address spaces to implement fork()
fork() without Copy on Write
[Figure: fork() without copy on write — the parent's Page 1 and Page 2 are physically copied to Page 1' and Page 2' for the child's page table]
fork() with Copy on Write
[Figure: fork() with copy on write — the parent's and child's page tables both point to the same physical Page 1 and Page 2; nothing is copied at fork() time]
fork() with Copy on Write
When either process modifies Page 1, the page fault handler allocates a new page and updates the PTE in the child process.
[Figure: after the write, the child's PTE for Page 1 points to the new copy Page 1', while Page 2 remains shared]
Simplifying Linking and Loading
• Linking
– Each program has a similar virtual address space
– Code, stack, and shared libraries always start at the same address
• Loading
[Figure: Linux process address space — kernel virtual memory above 0xc0000000 (invisible to user code); user stack created at runtime below it, tracked by the stack pointer %esp; memory-mapped region for shared libraries at 0x40000000; program code starting at 0x08048000; unused space below]
VM as a Tool for Mem Protection
• Extend PTEs with permission bits
• Page fault handler checks these before remapping
– If violated, send process SIGSEGV (segmentation fault)
              SUP   READ  WRITE  Address
Process i:
  VP 0:       No    Yes   No     PP 6
  VP 1:       No    Yes   Yes    PP 4
  VP 2:       Yes   Yes   Yes    PP 2

Process j:
  VP 0:       No    Yes   No     PP 9
  VP 1:       Yes   Yes   Yes    PP 6
  VP 2:       No    Yes   Yes    PP 11

[Figure: physical address space with pages PP 2, PP 4, PP 6, PP 8, PP 9, PP 11]
* SUP = supervisor mode (kernel mode)
Address Translation: Page Hit
VA: virtual address, PA: physical address, PTE: page table entry, PTEA = PTE address
[Figure: (1) CPU sends the virtual address to the MMU, (2) MMU sends the PTE address to cache/memory, (3) cache/memory returns the PTE, (4) MMU sends the physical address to cache/memory, (5) data is returned to the CPU]
Address Translation: Page Fault
[Figure: (1) CPU sends the virtual address to the MMU, (2) MMU sends the PTE address to cache/memory, (3) cache/memory returns the PTE, (4) the valid bit is 0, so the MMU raises a page fault exception, (5) the handler writes the victim page to disk, (6) the handler reads the new page from disk and updates the PTE, (7) the faulting instruction is re-executed]
Elephant(s) in the room
• Problem 1: Translation is slow!
– Many memory accesses for each memory access
– The L1 cache is not effective
• Problem 2: Page tables can be gigantic!
– We need one for each process
Speeding up Translation with a TLB
• Page table entries (PTEs) could be cached in L1 like any other data, but:
– PTEs may be evicted by other data references
– A PTE hit still requires a small L1 access delay
• Solution: a Translation Lookaside Buffer (TLB) — a small, fast cache of PTEs inside the MMU
TLB Hit
[Figure: (1) CPU sends the virtual address to the MMU, (2) MMU sends the VPN to the TLB, (3) TLB returns the PTE, (4) MMU sends the physical address to cache/memory, (5) data is returned — no memory access is needed to fetch the PTE]
TLB Miss
[Figure: (1) CPU sends the virtual address to the MMU, (2) MMU sends the VPN to the TLB — miss, (3) MMU sends the PTE address to cache/memory, (4) the returned PTE is inserted into the TLB, (5) MMU sends the physical address, (6) data is returned to the CPU]
Multi-Level Page Tables
• Example: 2-level paging
• Level 1 table
– Each PTE points to a Level 2 page table
– Always memory resident
• Level 2 tables
– Each PTE points to a page
– Paged in and out like any other data
– A Level 2 table is not created (or is swapped out to disk) if not needed
Two-Level Page Table Hierarchy
[Figure: a Level 1 page table whose PTE 0 and PTE 1 point to Level 2 tables mapping VP 0–2047 — 2K allocated VM pages for code and data; the remaining Level 1 PTEs are null, corresponding to an unallocated gap in virtual memory, so no Level 2 tables exist for that region]
Intel x86-64 Paging
• Current generation Intel x86 CPUs
– 64-bit architecture: maximum 2^64-byte address space
• In practice, only 48-bit addressing is implemented
– Page size of 4 KB (optionally 2 MB or 1 GB)
– Four levels of paging hierarchy
• 48-bit addressing → 256 TiB of virtual address space
ARM Paging
• 4 KB, 16 KB, 1 MB pages
• 32 bit architecture
– Up to two-level paging
• 64 bit architecture
– Up to three-level paging
RISC-V Paging (xv6 runs on Sv39 RISC-V)
[Figure: Sv39 address translation — three-level page tables over a 39-bit virtual address]
• Consider two-level paging with a page size of 4 KB.
– 32 bit address space
– p1 is 12 bits
– p2 is 10 bits
– PTE size is 8 bytes
• How many Level-2 page tables can be created in the worst case?
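One way to reason about it, taking the stated field widths at face value (note they are internally inconsistent: 12 + 10 + 12 = 34 bits, which exceeds the 32-bit address space as written — treat the arithmetic below as following the stated widths):

```latex
% Worst case: every Level-1 entry points to a distinct Level-2 table,
% so the count equals the number of Level-1 entries.
\[
  \text{worst-case number of Level-2 tables} = 2^{p_1} = 2^{12} = 4096
\]
% Each Level-2 table holds 2^{p_2} PTEs of 8 bytes each:
\[
  \text{size of one Level-2 table} = 2^{p_2} \times 8\,\mathrm{B}
    = 2^{10} \times 2^{3}\,\mathrm{B} = 8\,\mathrm{KB}
\]
```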