13-vm-notes

Uploaded by

Vishakha Agarwal
Copyright
© © All Rights Reserved

Virtual Memory

Vishal Shrivastav
CS 3410
Computer System Organization & Programming

These slides are the product of many rounds of teaching CS 3410 by Professors Weatherspoon, Bala, Bracy, and Sirer.
Where are we now and where are we going?
• How many programs do you run at once?

• a) 1
• b) 2
• c) 3-5
• d) 6-10
• e) 11+
Big Picture: Multiple Processes
How to run multiple processes?

• Time-multiplex a single CPU core (multi-tasking)
  - Web browser, Skype, office, … all must co-exist

• Many cores per processor (multi-core) or many processors (multi-processor)
  - Multiple programs run simultaneously
Processor & Memory
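• CPU address/data bus...
• … routed through caches
• … to main memory
  - Simple, fast, but…
[Figure: CPU connected through caches ($$) to Memory. The address space runs from 0x000…0 to 0xfff…f and holds Text, Data, Heap, and Stack (stack top at 0x7ff…f).]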
Multiple Processes
• Q: What happens when another program is executed concurrently on another processor?
• A: The addresses will conflict
  - Even though CPUs may take turns using the memory bus
[Figure: two CPUs, each with its own cache ($$), share one Memory. Both programs place Text, Data, Heap, and Stack at the same addresses between 0x000…0 and 0xfff…f.]
Multiple Processes
• Q: Can we relocate second program?
[Figure: two CPUs share one Memory (0x000…0 to 0xfff…f). The second program's Text, Data, Heap, and Stack are moved to different addresses than the first program's.]
Solution? Multiple processes/processors

• Q: Can we relocate second program?
• A: Yes, but…
  - What if they don’t fit?
  - What if not contiguous?
  - Need to recompile/relink?
  - …
[Figure: the Text, Data, Heap, and Stack segments of the two programs scattered through Memory.]
Big Picture: (Virtual) Memory

Give each process an illusion that it has exclusive access to entire main memory

Process 1 (virtual address → data): 3 → A, 2 → B, 1 → C, 0 → D
Process 2 (virtual address → data): 3 → E, 2 → F, 1 → G, 0 → H
But In Reality…
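[Figure: Physical Memory, addresses 0 to 14. The data of Process 1 (A, B, C, D) and Process 2 (E, F, G, H) is scattered and interleaved: D at 13, E at 11, C at 9, B between C and G, G at 6, H at 5, A at 3, F at 1.]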
How do we create the illusion?
[Figure: each process sees its own addresses 0 to 3 (Process 1: D, C, B, A; Process 2: H, G, F, E), while the same data sits scattered across Physical Memory addresses 0 to 14.]
How do we create the illusion?
“All problems in computer science can be solved by another level of indirection.” – David Wheeler
[Figure: each process sees its own addresses 0 to 3 (Process 1: D, C, B, A; Process 2: H, G, F, E), while the same data sits scattered across Physical Memory addresses 0 to 14.]
How do we create the illusion?
• Map virtual address to physical address
• Memory management unit (MMU) takes care of the mapping
• Virtual Memory is just a concept; it does not exist physically
[Figure: the virtual addresses 0 to 3 of Process 1 and Process 2 (Virtual Memory) are mapped by arrows to physical addresses 0 to 14 (Physical Memory).]
How do we create the illusion?
• Process 1 wants to access data C
• Process 1 thinks it is stored at addr 1
• So CPU generates addr 1
• This addr is intercepted by MMU
• MMU knows this is a virtual addr
• MMU looks at the mapping
• Virtual addr 1 -> Physical addr 9
• Data at Physical addr 9 is sent to CPU
• And that data is indeed C!!!
[Figure: Virtual Memory (just a concept; does not exist physically) of Process 1 and Process 2 on the left, Physical Memory addresses 0 to 14 on the right, with C stored at physical address 9.]
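The walk-through above can be sketched as a pair of lookup tables. This is a minimal illustration, not how an MMU is built: the dictionaries and the `mmu_load` helper are hypothetical, and apart from "virtual addr 1 -> physical addr 9", which the slide states, the entries are read off the figure where it is legible (B is omitted because its physical address is not clear).

```python
# Physical memory contents as far as the figure shows them (assumption).
PHYSICAL_MEMORY = {13: "D", 11: "E", 9: "C", 6: "G", 5: "H", 3: "A", 1: "F"}

# One map per process: virtual address -> physical address (assumed from the figure).
PROCESS_1_MAP = {3: 3, 1: 9, 0: 13}
PROCESS_2_MAP = {3: 11, 2: 1, 1: 6, 0: 5}


def mmu_load(mapping, vaddr):
    """Translate a virtual address with the given map, then read physical memory."""
    paddr = mapping[vaddr]          # the level of indirection
    return PHYSICAL_MEMORY[paddr]   # the data delivered back to the CPU


if __name__ == "__main__":
    # Both processes use virtual address 1 but reach different data.
    print(mmu_load(PROCESS_1_MAP, 1))
    print(mmu_load(PROCESS_2_MAP, 1))
```

The same virtual address resolves to different physical locations depending on which process's map is in use, which is what removes the address conflict.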
How do we create the illusion?
• Map virtual address to physical address
• Memory management unit (MMU) takes care of the mapping
[Figure: the same Virtual Memory to Physical Memory mapping, except that A and F now reside on Disk instead of in Physical Memory.]
Big Picture: (Virtual) Memory
• From a process’s perspective:
  - Process only sees the virtual memory
  - Contiguous memory
  - No need to recompile - only mappings need to be updated
  - When physical memory runs out, the MMU maps data on disk in a transparent manner
[Figure: Process 1's Virtual Memory (3 → A, 2 → B, 1 → C, 0 → D). Where the data actually lives, in Physical Memory or on Disk (C in this example), is hidden from the process.]
Next Goal
• How does Virtual Memory work?

• i.e., how do we create the “map” that maps a virtual address generated by the CPU to a physical address used by main memory?
Virtual Memory Agenda
What is Virtual Memory?
How does Virtual memory Work?
• Address Translation
• Overhead
• Paging
• Performance

Picture Memory as… ?
Three ways to picture memory:
• Byte Array: addr → data, one byte per address from 0x00000000 to 0xffffffff
• Segments: system reserved (0x80000000 to 0xfffffffc), stack (top at 0x7ffffffc), heap, data (from 0x10000000), text (from 0x00400000), system reserved (from 0x00000000)
• New! Page Array: page 0 at 0x00000000, page 1 at 0x00001000, page 2 at 0x00002000, ..., page n at 0xfffff000
  - each segment uses some # of pages
A Little More About Pages
Memory size = depends on system, say 4GB
Page size = 4KB (by default)
Then, # of pages = 2^20

Any data in page # 2 has an address of the form: 0x00002xxx
Lower 12 bits specify which byte you are in the page:
0x00002200 → offset 0x200 = 0010 0000 0000 = byte 512

upper bits = page number (PPN)
lower bits = page offset

[Figure: Page Array of 4KB pages starting at 0x00000000, 0x00001000, 0x00002000, 0x00003000, 0x00004000, ..., 0xffffd000, 0xffffe000, 0xfffff000.]
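The split into page number and page offset is plain bit arithmetic. The sketch below assumes the 4KB page size used on this slide; the name `split_address` is made up for illustration.

```python
PAGE_SIZE = 4096      # 4KB pages, as assumed on the slide
OFFSET_BITS = 12      # log2(4096)


def split_address(addr):
    """Return (page number, page offset) for a 32-bit address."""
    page_number = addr >> OFFSET_BITS      # upper 20 bits
    page_offset = addr & (PAGE_SIZE - 1)   # lower 12 bits
    return page_number, page_offset


if __name__ == "__main__":
    page, offset = split_address(0x00002200)
    print(page, offset)
```

For 0x00002200 this gives page 2 and byte 512 within that page, matching the slide.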
Page Table: Data structure to store mapping
1 Page Table per process
Lives in Memory, i.e. in a page (or more…)
Location stored in Page Table Base Register (PTBR)
Part of program state (like PC)
PTBR = 0x00008000
Assuming each page = 4KB
[Figure: the page table occupies the page 0x00008000 to 0x00008FFF. The entry at 0x00008004 holds 0x00000005 (C is in physical page 5), the entry at 0x00008008 holds 0x00000004 (B is in page 4), and the entry at 0x0000800c holds 0x00000001 (A is in page 1) of the Physical Address Space.]
Address Translator: MMU
• Programs use virtual addresses
• Actual memory uses physical addresses

Memory Management Unit (MMU)
• HW structure
• Translates virtual → physical address on the fly

[Figure: Program #1 and Program #2 each see virtual pages 3 → A, 2 → B, 1 → C, 0 → D. The MMU sits between the programs and the Physical Address Space of Memory (DRAM), where those pages are stored at scattered locations.]
Simple Page Table Translation
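vaddr = 0x00002ABC
  - bits 31-12 = 0x00002: index into the page table
  - bits 11-0 = 0xABC: page offset
PTBR = 0x90000000
Page table entries: 0x90000000: 0x00000, 0x90000004: 0x10044, 0x90000008: 0x4123B, 0x9000000c: 0xC20A3, ...
PageTable[0x00002] (stored at 0x90000008) = 0x4123B
paddr = 0x4123B followed by offset 0xABC = 0x4123BABC
Assuming each page = 4KB
[Figure: Memory with pages at 0xC20A3000, 0x90000000 (the page table), 0x4123B000, 0x10045000, 0x10044000, and 0x00000000.]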
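The translation on this slide can be sketched in a few lines. The table contents and PTBR value are taken from the figure; the dictionary representation and the helper names `translate` and `pte_address` are assumptions made for illustration, and real hardware reads the entry from memory rather than from a Python dict.

```python
PTBR = 0x90000000   # page table base register, from the slide
PTE_SIZE = 4        # bytes per entry (entries are 4 bytes apart in the figure)

# Virtual page number -> physical page number, as read off the figure.
PAGE_TABLE = {0x00000: 0x00000, 0x00001: 0x10044, 0x00002: 0x4123B, 0x00003: 0xC20A3}


def pte_address(vpn):
    """Memory address of the page table entry for a virtual page number."""
    return PTBR + vpn * PTE_SIZE


def translate(vaddr):
    """Single-level translation with 4KB pages."""
    vpn = vaddr >> 12          # bits 31-12 index the page table
    offset = vaddr & 0xFFF     # bits 11-0 pass through unchanged
    ppn = PAGE_TABLE[vpn]
    return (ppn << 12) | offset


if __name__ == "__main__":
    print(hex(pte_address(0x00002)))
    print(hex(translate(0x00002ABC)))
```

Only the page number is translated; the offset is copied unchanged into the physical address.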
General Address Translation
• What if the page size is not 4KB?
  - Page offset is no longer 12 bits

Clicker Question:
Page size is 16KB → how many bits is page offset?
(a) 12 (b) 13 (c) 14 (d) 15 (e) 16

• What if Main Memory is not 4GB?
  - Physical page number is no longer 20 bits

Clicker Question:
Page size 4KB, Main Memory 512 MB → how many bits is PPN?
(a) 15 (b) 16 (c) 17 (d) 18 (e) 19
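Both clicker questions reduce to base-2 logarithms. The helpers below are hypothetical names for that arithmetic and assume sizes are powers of two.

```python
def offset_bits(page_size_bytes):
    """Bits needed to address every byte in a page (page size must be a power of two)."""
    return page_size_bytes.bit_length() - 1


def ppn_bits(memory_bytes, page_size_bytes):
    """Bits needed to name every physical page."""
    return offset_bits(memory_bytes) - offset_bits(page_size_bytes)


if __name__ == "__main__":
    KB, MB, GB = 2**10, 2**20, 2**30
    print(offset_bits(16 * KB))         # 16KB pages
    print(ppn_bits(512 * MB, 4 * KB))   # 512MB memory, 4KB pages
    print(ppn_bits(4 * GB, 4 * KB))     # the 4GB, 4KB baseline
```

A 16KB page is 2^14 bytes, so the offset takes 14 bits; 512MB is 2^29 bytes, and 29 - 12 leaves 17 bits for the PPN.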
Virtual Memory: Summary
Virtual Memory: a Solution for All Problems

• Each process has its own virtual address space
  - Program/CPU can access any address from 0…2^N-1 (N = number of bits in address register)
  - A process is a program being executed
  - Programmer can code as if they own all of memory

• On-the-fly at runtime, for each memory access
  - all accesses are indirect through a virtual address
  - map → translate fake virtual address to a real physical address
  - redirect load/store to the physical address
Advantages of Virtual Memory
Easy relocation
• Loader puts code anywhere in physical memory
• Virtual mappings to give illusion of correct layout
Higher memory utilization
• Provide illusion of contiguous memory
• Use all physical memory, even physical address 0x0
Easy sharing
• Different mappings for different programs / cores

And more to come…


Takeaway
• All problems in computer science can be solved by another level of indirection.

• Need a map to translate a “fake” virtual address (generated by CPU) to a “real” physical address (in memory)

• Virtual memory is implemented via a “Map”, a PageTable, that maps a vaddr (a virtual address) to a paddr (physical address):
paddr = PageTable[vaddr]
Feedback
• How much did you love today’s lecture?
A: As much as Melania loves Trump
B: As much as Kanye loves Kanye
C: Somewhere in between, but closer to A
D: Somewhere in between, but closer to B
E: I am incapable of loving anything 
MIPS Processor Milestone Celebration!
Page Table Overhead
• How large is PageTable?
• Virtual address space (for each process):
  - Given: total virtual memory: 2^32 bytes = 4GB
  - Given: page size: 2^12 bytes = 4KB
  - # entries in PageTable? 2^20 = 1 million entries
  - size of PageTable? PTE size = 4 bytes, so PageTable size = 4 x 2^20 = 4MB
• Physical address space:
  - total physical memory: 2^29 bytes = 512MB
  - overhead for 10 processes? 10 x 4MB = 40 MB of overhead!
• 40 MB / 512 MB = 7.8% overhead, space due to PageTable
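The overhead figures above can be checked with a short calculation. The function name is illustrative only, and the inputs are the slide's numbers.

```python
def page_table_bytes(virtual_bytes, page_bytes, pte_bytes):
    """Size of a single-level page table: one entry per virtual page."""
    num_entries = virtual_bytes // page_bytes
    return num_entries * pte_bytes


if __name__ == "__main__":
    table = page_table_bytes(2**32, 2**12, 4)   # 4GB virtual, 4KB pages, 4-byte PTEs
    total = 10 * table                          # 10 processes, one table each
    physical = 2**29                            # 512MB of physical memory
    print(table // 2**20, "MB per process")
    print(f"{100 * total / physical:.1f}% of physical memory")
```

Each process pays for a full table whether or not it uses most of its address space, which motivates the multi-level design later in the deck.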
But Wait... There’s more!
• Page Table Entry won’t be just an integer
• Meta-Data
  - Valid Bits
    • What PPN means “not mapped”? No such number…
    • At first: not all virtual pages will be in physical memory
    • Later: might not have enough physical memory to map all virtual pages
  - Page Permissions
    • R/W/X permission bits for each PTE
    • Code: read-only, executable
    • Data: writeable, not executable
Less Simple Page Table
V  R W X  Physical Page Number
0
1  1 1 0  0xC20A3
0
0
1  1 0 0  0xC20A3
1         0x4123B
1         0x10044
0

Aliasing: mapping several virtual addresses → same physical page (two entries above map to 0xC20A3)

Process tries to access a page without proper permissions → Segmentation Fault
Example: Write to read-only? → process killed

[Figure: Memory with pages at 0xC20A3000, 0x90000000, 0x4123B000, 0x10045000, 0x10044000, and 0x00000000.]
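A permission check on a page table entry might look like the sketch below. The `PTE` class, the two exception types, and `check_access` are all hypothetical; in a real system the check happens in hardware and the OS decides how to react, and an invalid entry is reported here as a page fault purely as a modelling choice.

```python
from dataclasses import dataclass


class PageFault(Exception):
    """Raised when the entry is not valid (page not mapped in memory)."""


class SegmentationFault(Exception):
    """Raised when the access violates the page's R/W/X permissions."""


@dataclass
class PTE:
    valid: bool
    read: bool = False
    write: bool = False
    execute: bool = False
    ppn: int = 0


def check_access(pte, kind):
    """kind is 'r', 'w', or 'x'. Returns the PPN if the access is allowed."""
    if not pte.valid:
        raise PageFault("page not mapped")
    allowed = {"r": pte.read, "w": pte.write, "x": pte.execute}[kind]
    if not allowed:
        raise SegmentationFault(f"'{kind}' access not permitted")
    return pte.ppn


if __name__ == "__main__":
    read_only = PTE(valid=True, read=True, ppn=0xC20A3)   # the V=1 R=1 W=0 X=0 row
    print(hex(check_access(read_only, "r")))
    try:
        check_access(read_only, "w")
    except SegmentationFault as err:
        print("segmentation fault:", err)
```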
Now how big is this Page Table?
struct pte_t page_table[1 << 20];  /* 2^20 entries */
Each PTE = 8 bytes
Assuming each page = 4KB
How many pages in memory will the page table take up?

Clicker Question:
(a) 4 million (2^22) pages
(b) 2048 (2^11) pages
(c) 1024 (2^10) pages
(d) 4 billion (2^32) pages
(e) 4K (2^12) pages
Wait, how big is this Page Table?
page_table[2^20] = 8 x 2^20 = 2^23 bytes
(Page Table = 8 MB in size)

How many pages in memory will the page table take up?
Assuming each page = 4KB: 2^23 / 2^12 = 2^11 = 2K pages!

Clicker answer: (b) 2048 (2^11) pages
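The page count follows from dividing the table size by the page size. A small check, with a made-up helper name:

```python
def pages_for_page_table(num_entries, pte_bytes, page_bytes):
    """Number of pages a page table occupies, rounded up to whole pages."""
    table_bytes = num_entries * pte_bytes
    return -(-table_bytes // page_bytes)   # ceiling division


if __name__ == "__main__":
    print(pages_for_page_table(2**20, 8, 4096))   # 8-byte PTEs
    print(pages_for_page_table(2**20, 4, 4096))   # 4-byte PTEs, for comparison
```

Doubling the entry size from 4 to 8 bytes doubles the table from 1024 to 2048 pages.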
Takeaway
• All problems in computer science can be solved by another level of indirection.

• Need a map to translate a “fake” virtual address (generated by CPU) to a “real” physical address (in memory)

• Virtual memory is implemented via a “Map”, a PageTable, that maps a vaddr (a virtual address) to a paddr (physical address):
  paddr = PageTable[vaddr]

• A page is a constant-size block of virtual memory. Often, the page size will be around 4KB to reduce the number of entries in a PageTable.

• We can use the PageTable to set Read/Write/Execute permission on a per page basis. Can allocate memory on a per page basis. Need a valid bit, as well as Read/Write/Execute and other bits.
Next Goal
• How do we reduce the size (overhead) of the
PageTable?

• A: Another level of indirection!!


Single-Level Page Table
vaddr: bits 31-12 (20 bits) = index into the Page Table, bits 11-0 (12 bits) = page offset
“Where is my physical page?” → PTBR plus the index selects a PTEntry, which holds the PPN
Total size = 2^20 * 4 bytes = 4MB
Multi-Level Page Table
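vaddr: bits 31-22 (10 bits) = index into the Page Directory, bits 21-12 (10 bits) = index into a Page Table, bits 11-0 (12 bits) = page offset
“Where is my translation?” → PTBR points to the Page Directory; the selected PDEntry points to a Page Table
“Where is my physical page?” → the selected PTEntry in that Page Table holds the PPN
Page Directory and Page Table are also referred to as Level 1 and Level 2 Page Tables
* Indirection to the Rescue, AGAIN!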
Multi-Level Page Table
vaddr: 10 bits (31-22) | 10 bits (21-12) | 12 bits (11-0)
Assuming each entry is 4 bytes, what is the size of the Page Directory?
A: 2KB B: 2MB C: 4KB D: 4MB
Multi-Level Page Table
vaddr: 10 bits (31-22) | 10 bits (21-12) | 12 bits (11-0)
Assuming each entry is 4 bytes, what is the total size of ALL Page Tables?
A: 2KB B: 2MB C: 4KB D: 4MB
Multi-Level Page Table
vaddr: 10 bits (31-22) | 10 bits (21-12) | 12 bits (11-0)
Page Directory: Size = 2^10 * 4 bytes = 4KB
Page Tables: Size = 2^10 (# page tables) * 2^10 (# entries per page table) * 4 bytes = 4MB
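A two-level walk under the 10/10/12 split can be sketched as follows. The nested-dictionary representation, the sample mapping, and the function names are assumptions for illustration; a real page directory holds physical addresses of page tables, not Python objects.

```python
def split_vaddr(vaddr):
    """Split a 32-bit virtual address into (directory index, table index, offset)."""
    pd_index = (vaddr >> 22) & 0x3FF   # bits 31-22
    pt_index = (vaddr >> 12) & 0x3FF   # bits 21-12
    offset = vaddr & 0xFFF             # bits 11-0
    return pd_index, pt_index, offset


def walk(page_directory, vaddr):
    """Two lookups: directory entry -> page table, then table entry -> PPN."""
    pd_index, pt_index, offset = split_vaddr(vaddr)
    page_table = page_directory[pd_index]   # level 1
    ppn = page_table[pt_index]              # level 2
    return (ppn << 12) | offset


if __name__ == "__main__":
    # Hypothetical mapping: directory entry 1 -> a table whose entry 2 holds PPN 0x4123B.
    page_directory = {1: {2: 0x4123B}}
    print(split_vaddr(0x00402ABC))
    print(hex(walk(page_directory, 0x00402ABC)))
```

The extra dictionary lookup is the "longer lookup" cost of the scheme: each translation now touches two tables instead of one.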
Multi-Level Page Table
Doesn’t this take up more memory than before?
- YES, but..
Benefits
• Don’t need 4MB contiguous physical memory
• Don’t need to allocate every PageTable, only those containing valid PTEs

Drawbacks
• Performance: Longer lookups
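The benefit of allocating only the page tables that contain valid PTEs can be estimated with a rough model. It assumes 4-byte entries and 1024 entries per table as on the earlier slides; the function and the sample set of mapped pages are hypothetical.

```python
def two_level_bytes(mapped_vpns, entry_bytes=4, entries_per_table=1024):
    """Bytes used by the page directory plus only the page tables that are needed."""
    directory_bytes = entries_per_table * entry_bytes
    tables_needed = {vpn // entries_per_table for vpn in mapped_vpns}
    return directory_bytes + len(tables_needed) * entries_per_table * entry_bytes


if __name__ == "__main__":
    # A small process: 10 pages at the bottom of the address space, 16 at the top.
    mapped = list(range(10)) + list(range(0xFFFF0, 0x100000))
    print(two_level_bytes(mapped), "bytes for a sparse address space")
    print(two_level_bytes(range(2**20)), "bytes if every page is mapped")
```

In this model the sparse process needs the directory plus two page tables (12KB), while a fully mapped address space costs slightly more than the 4MB single-level table, which is the "YES, but.." on the slide.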
Paging
What if process requirements > physical memory?
Virtual starts earning its name

Memory acts as a cache for secondary storage (disk)
  - Swap memory pages out to disk when not in use
  - Page them back in when needed

Courtesy of Temporal & Spatial Locality (again!)
  - Pages used recently are most likely to be used again

More Meta-Data:
• Dirty Bit, Recently Used, etc.
• OS may access this meta-data to choose a victim
Paging
V  R W X  D  Physical Page Number
0            --
1  1 0 1  0  0x10045
0            --
0            --
0  0         disk sector 200
0  0         disk sector 25
1  1 1 0  1  0x00000
0            --

Example: accessing an address beginning with 0x00003 (PageTable[3]) results in a Page Fault, which will page the data in from disk sector 200.

[Figure: Memory with pages at 0xC20A3000, 0x90000000, 0x4123B000, 0x10045000, and 0x00000000; Disk holding sectors 200 and 25.]
Page Fault
Valid bit in Page Table = 0
  - means page is not in memory

OS takes over:
• Choose a physical page to replace
  - “Working set”: refined LRU, tracks page usage
• If dirty, write to disk
• Read missing page from disk
  - Takes so long (~10ms), OS schedules another task

Performance-wise page faults are really bad!
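The fault-handling steps above can be simulated in a few lines. This sketch substitutes plain LRU for the "working set" policy the slide mentions, and the `PagedMemory` class with its counters is hypothetical; it only counts faults and dirty write-backs and does not model disk I/O.

```python
from collections import OrderedDict


class PagedMemory:
    """Tracks which virtual pages are resident, evicting the least recently used."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.resident = OrderedDict()   # vpn -> dirty bit, least recently used first
        self.page_faults = 0
        self.writebacks = 0

    def access(self, vpn, write=False):
        if vpn in self.resident:
            self.resident.move_to_end(vpn)          # mark as recently used
        else:
            self.page_faults += 1                   # valid bit was 0
            if len(self.resident) >= self.num_frames:
                _victim, dirty = self.resident.popitem(last=False)
                if dirty:
                    self.writebacks += 1            # dirty victim goes to disk first
            self.resident[vpn] = False              # page read in from disk, clean
        if write:
            self.resident[vpn] = True               # set the dirty bit


if __name__ == "__main__":
    mem = PagedMemory(num_frames=2)
    mem.access(0, write=True)
    mem.access(1)
    mem.access(2)   # evicts page 0, which is dirty
    mem.access(0)   # evicts page 1, which is clean
    print(mem.page_faults, mem.writebacks)
```

Only a dirty victim costs a write to disk; a clean victim can simply be dropped because the disk copy is still current.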
Watch Your Performance Tank!
For every instruction:
• MMU translates address (virtual → physical)
  - Uses PTBR to find Page Table in memory
  - Looks up entry for that virtual page
• Fetch the instruction using physical address
  - Access Memory Hierarchy (I$ → L2 → Memory)
• Repeat at Memory stage for load/store insns
  - Translate address
  - Now you perform the load/store
Performance
• Virtual Memory Summary
• PageTable for each process:
  - Single-level (e.g. 4MB contiguous in physical memory)
  - or multi-level (e.g. less mem overhead due to page table), …
  - every load/store translated to physical addresses
  - page table miss: load a swapped-out page and retry instruction, or kill program
• Performance?
  - terrible: memory is already slow, translation makes it slower
• Solution?
Next Goal
• How do we speed up address translation?
Translation Lookaside Buffer (TLB)
• Small, fast cache
• Holds VPN → PPN translations
• Exploits temporal locality in pagetable
• TLB Hit: huge performance savings
• TLB Miss: invoke TLB miss handler
  - Put translation in TLB for later

[Figure: the CPU sends a VA to the MMU. The TLB inside the MMU holds entries with the VPN as “tag” and the PPN as “data”; the MMU outputs the PA.]
TLB Parameters
Typical
• very small (64 – 256 entries) → very fast
• fully associative, or at least set associative

Example: Intel Nehalem TLB
• 128-entry L1 Instruction TLB, 4-way LRU
• 64-entry L1 Data TLB, 4-way LRU
• 512-entry L2 Unified TLB, 4-way LRU
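A TLB can be modelled as a small cache in front of the page table. The sketch below assumes a fully associative TLB with LRU replacement and 4KB pages; the `TLB` class, its default capacity of 64, and the sample page table are illustrative choices, not a description of any particular processor.

```python
from collections import OrderedDict


class TLB:
    """Fully associative VPN -> PPN cache with LRU replacement."""

    def __init__(self, page_table, capacity=64):
        self.page_table = page_table    # backing map used on a miss
        self.capacity = capacity
        self.entries = OrderedDict()    # vpn -> ppn, least recently used first
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        vpn, offset = vaddr >> 12, vaddr & 0xFFF
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)
        else:
            self.misses += 1
            ppn = self.page_table[vpn]              # walk the page table
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)    # evict the LRU translation
            self.entries[vpn] = ppn                 # keep it for later
        return (self.entries[vpn] << 12) | offset


if __name__ == "__main__":
    tlb = TLB({0x00002: 0x4123B})
    print(hex(tlb.translate(0x00002ABC)))   # miss: walks the page table
    print(hex(tlb.translate(0x00002DEF)))   # hit: same page, no walk
    print(tlb.hits, tlb.misses)
```

Accesses to the same page after the first one hit in the TLB, which is how temporal locality hides the cost of the page table walk.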
TLB to the Rescue!
For every instruction:
• Translate the address (virtual → physical)
  - CPU checks TLB
  - That failing, walk the Page Table
    • Use PTBR to find Page Table in memory
    • Look up entry for that virtual page
    • Cache the result in the TLB
• Fetch the instruction using physical address
  - Access Memory Hierarchy (I$ → L2 → Memory)
• Repeat at Memory stage for load/store insns
  - CPU checks TLB, translate if necessary
  - Now perform load/store
Translation in Action
Virtual Address → TLB Access
• TLB Hit? no → TLB miss handler (HW or OS)
• TLB Hit? yes → Physical Address → $ Access
• $ Hit? yes → deliver Data back to CPU
• $ Hit? no → DRAM Access
• DRAM Hit? yes → deliver Data back to CPU

Next Topic: Exceptional Control Flow
Takeaways
Need a map to translate a “fake” virtual address (from process) to a “real” physical address (in memory).

The map is a Page Table: ppn = PageTable[vpn]

A page is a constant-size block of virtual memory. Often ~4KB to reduce the number of entries in a PageTable.

Page Table can enforce Read/Write/Execute permissions on a per page basis. Can allocate memory on a per page basis. Also need a valid bit, and a few others.

Space overhead due to Page Table is significant.
Solution: another level of indirection! A two-level Page Table significantly reduces overhead.

Time overhead due to Address Translations also significant.
Solution: caching! Translation Lookaside Buffer (TLB) acts as a cache for the Page Table and significantly improves performance.
