CSC 252/452: Computer Organization

Fall 2024: Lecture 21

Instructor: Yanan Guo

Department of Computer Science


University of Rochester
Carnegie Mellon

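Virtual Memory

[Figure: each process has its own page table, which maps its virtual pages (VP) either to physical/main memory or to the hard drive.]
Process 1: VP 99 is unallocated (invalid); VP 100 (Data 1) maps to physical page 1; VP 102 (Data 2) maps to physical page 4; VPs 101, 103, 104, 105 are on the hard drive at locations A, B, C, D.
Process 2: VP 100 is unallocated; VP 102 (Data 3) maps to physical page 3; VP 103 (Data 2) maps to physical page 4; VPs 99, 101, 104, 105 are on the hard drive at locations O, P, Q, R.
Physical/main memory: page 1 holds Data 1, page 2 is shown without data, page 3 holds Data 3, page 4 holds Data 2 (shared by both processes).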
A System Using Virtual Memory

[Figure: on the CPU chip, the CPU sends a virtual address (VA, e.g., 4100) to the MMU; the MMU sends a physical address (PA, e.g., 4) to main memory (addresses 0 to M-1); main memory returns the data word.]

• The memory management unit (MMU) does the VA to PA translation. On a page fault, the OS page fault handler moves data between physical memory and disk.

Calculate Bits in VA and PA

• In a 64-bit machine, the VA is 64 bits long. Assume PM is 4 GB and the page size is 4 KB.
• How many bits for page offset?
• 12 (4 KB = 2^12 bytes). Same for VM and PM
• How many bits for Virtual Page Number?
• 52 (64 - 12), i.e., 2^52 virtual pages
• How many bits for Physical Page Number?
• 20 (4 GB = 2^32 bytes, so 32 - 12), i.e., 2^20 physical pages

VA layout: Virtual Page Number (52 bits), offset (12 bits)
PA layout: Physical Page Number (20 bits), offset (12 bits)

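The bit counts for this example can be checked with a short calculation. This is a minimal sketch; the function name `address_bits` and its parameters are illustrative and not part of the lecture, and it assumes all sizes are powers of two.

```python
def log2_exact(value: int) -> int:
    """Return log2 of a power of two, using integer arithmetic only."""
    assert value > 0 and value & (value - 1) == 0, "value must be a power of two"
    return value.bit_length() - 1


def address_bits(va_bits: int, phys_mem_bytes: int, page_bytes: int) -> dict:
    """Split virtual and physical addresses into page number and offset widths."""
    offset_bits = log2_exact(page_bytes)      # bits that select a byte within a page
    pa_bits = log2_exact(phys_mem_bytes)      # bits that address all of physical memory
    return {
        "offset": offset_bits,
        "vpn": va_bits - offset_bits,         # virtual page number width
        "ppn": pa_bits - offset_bits,         # physical page number width
    }


# The example from the slide: 64-bit VA, 4 GB of physical memory, 4 KB pages.
widths = address_bits(64, 4 * 2**30, 4 * 2**10)
print(widths)
```

The offset width depends only on the page size, which is why it is the same for virtual and physical addresses.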
Today
• VM basic concepts and operation
• Other critical benefits of VM
• Address translation

So Far…

[Figure: users 1 through n each issue virtual addresses (VA) to a "magic" memory management unit (part of OS), which turns them into physical addresses (PA) and returns data.]

What does an MMU do?


• Translate address from a VA to PA
• Enforce permissions
• Fetch from disk

Address Translation With a Page Table

[Figure: translating a virtual address to a physical address through the page table.]
• Virtual address (issued by CPU), n bits: virtual page number (VPN) in bits n-1..p, virtual page offset (VPO) in bits p-1..0
• Page table base register (PTBR): physical address of the page table for the current process
• Page table (in physical memory): each entry holds a valid bit and a physical page number (PPN)
• PTEA = PTBR + VPN * sizeof(PTE)
• Valid bit = 0: page not in memory (page fault)
• Valid bit = 1: the PTE supplies the PPN
• Physical address (what will be used to access the DRAM), m bits: PPN in bits m-1..p, physical page offset (PPO) in bits p-1..0; the PPO is the same as the VPO

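The translation above can be sketched in a few lines. This is an illustrative model rather than how an MMU is built: the page table is a Python dict keyed by VPN, and the names `translate`, `pte_address`, and `PageFault`, as well as the 4 KB page size, are assumptions made for the example.

```python
PAGE_BITS = 12  # assumption: 4 KB pages, so p = 12


class PageFault(Exception):
    """Raised when the PTE for a virtual page has valid bit = 0."""


def pte_address(ptbr: int, vpn: int, pte_size: int) -> int:
    """PTEA = PTBR + VPN * sizeof(PTE)."""
    return ptbr + vpn * pte_size


def translate(va: int, page_table: dict) -> int:
    """Translate a virtual address using a table of vpn -> (valid, ppn)."""
    vpn = va >> PAGE_BITS                      # upper bits select the PTE
    vpo = va & ((1 << PAGE_BITS) - 1)          # lower bits pass through unchanged
    valid, ppn = page_table.get(vpn, (False, None))
    if not valid:
        raise PageFault(f"virtual page {vpn:#x} is not in memory")
    return (ppn << PAGE_BITS) | vpo            # PPO equals VPO


# Hypothetical table: virtual page 0x4 is resident in physical page 0x2A.
table = {0x4: (True, 0x2A), 0x5: (False, None)}
print(hex(translate(0x4123, table)))
```

Accessing an address in virtual page 0x5 raises `PageFault`, which corresponds to the valid bit = 0 case.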
Address Translation: Page Hit

[Figure: CPU and MMU on the CPU chip, connected to memory. Arrows: 1) VA from CPU to MMU; 2) PTEA from MMU to memory; 3) PTE from memory to MMU; 4) PA from MMU to memory; 5) data from memory to CPU.]

1) Processor sends virtual address to MMU
2-3) MMU fetches PTE from page table in memory
4) MMU sends physical address to cache/memory
5) Cache/memory sends data word to processor

VA: virtual address, PA: physical address, PTE: page table entry, PTEA: PTE address

Address Translation: Page Fault

[Figure: CPU and MMU on the CPU chip, connected to memory and disk. Arrows: 1) VA from CPU to MMU; 2) PTEA from MMU to memory; 3) PTE from memory to MMU; 4) exception from MMU to the page fault handler; 5) victim page from memory to disk; 6) new page from disk to memory; 7) return to the CPU.]

1) Processor sends virtual address to MMU
2-3) MMU fetches PTE from page table in memory
4) Valid bit is zero, so MMU triggers page fault exception
5) Handler identifies victim (and, if dirty, pages it out to disk)
6) Handler pages in new page and updates PTE in memory
7) Handler returns to original process, restarting faulting instruction

VA: virtual address, PA: physical address, PTE: page table entry, PTEA: PTE address

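Steps 5 and 6 of the page fault sequence can be sketched as a toy handler. This is only a sketch under stated assumptions: memory and disk are modeled as dicts, the victim is chosen first-in first-out (the slides do not name a replacement policy), and `handle_page_fault` and its parameters are hypothetical names.

```python
def handle_page_fault(vpn, page_table, resident, frames, disk):
    """Bring virtual page `vpn` into memory, evicting one resident page.

    page_table: vpn -> {"valid": bool, "ppn": int or None, "dirty": bool}
    resident:   list of resident vpns, oldest first (FIFO order, an assumption)
    frames:     ppn -> page contents currently in physical memory
    disk:       vpn -> page contents stored on disk
    """
    # Step 5: identify a victim and, if it is dirty, page it out to disk.
    victim = resident.pop(0)
    victim_pte = page_table[victim]
    ppn = victim_pte["ppn"]
    if victim_pte["dirty"]:
        disk[victim] = frames[ppn]
    victim_pte.update(valid=False, ppn=None, dirty=False)

    # Step 6: page in the new page and update its PTE.
    frames[ppn] = disk[vpn]
    page_table[vpn] = {"valid": True, "ppn": ppn, "dirty": False}
    resident.append(vpn)
    return ppn  # Step 7 would restart the faulting instruction.


# Hypothetical state: one physical page, holding a modified copy of page 1.
page_table = {
    1: {"valid": True, "ppn": 0, "dirty": True},
    2: {"valid": False, "ppn": None, "dirty": False},
}
resident = [1]
frames = {0: "page 1 (modified)"}
disk = {1: "page 1 (stale)", 2: "page 2"}
new_ppn = handle_page_fault(2, page_table, resident, frames, disk)
print(new_ppn, frames, disk)
```

After the handler returns, the retried access finds valid bit = 1 and proceeds as a page hit.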
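Integrating VM and Cache

[Figure: a cache sits between the MMU and memory on the CPU chip. The CPU sends the VA to the MMU. The MMU sends the PTEA to the cache: on a PTEA hit the cache returns the PTE, and on a PTEA miss the PTEA goes to memory, which returns the PTE. The MMU then sends the PA to the cache: on a PA hit the cache returns the data, and on a PA miss the PA goes to memory, which returns the data.]

VA: virtual address, PA: physical address, PTE: page table entry, PTEA: PTE address
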
Today
• Three Virtual Memory Optimizations
• TLB
• Virtually-indexed, physically-tagged cache
• Page the page table (a.k.a., multi-level page table)

• Case-study: Intel Core i7/Linux example

Speeding up Address Translation

• Problem: Every memory load/store requires two memory accesses: one for the PTE, another for the actual data
• The PTE access is pure overhead
• Can we speed it up?

• Page table entries (PTEs) are already cached in the cache like any other memory data. But:
• PTEs may be evicted by other data references
• A PTE hit still incurs a small cache delay

Speeding up Translation with a TLB

• Solution: Translation Lookaside Buffer (TLB)
• Think of it as a dedicated cache for the page table
• Small set-associative hardware cache in the MMU
• Contains complete page table entries for a small number of pages

[Figure: a conventional data cache, for comparison. The address is split into Tag and Set Index. The Set Index selects one of sets 0 to T-1, each holding lines of (valid bit, tag, data). The tag is compared against the tags in the selected set to decide cache hit/miss.]

Accessing the TLB

• MMU uses the Virtual Page Number portion of the virtual address to access the TLB:

[Figure: a page table cache (the TLB). The Virtual Page Number (bits n-1..p) is split into the TLB tag (TLBT, bits n-1..p+t) and the TLB index (TLBI, bits p+t-1..p); bits p-1..0 are the offset. TLBI selects one of sets 0 to T-1, each holding lines of (valid bit, tag, PTE). TLBT matches the tag of a line within the set.]

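The TLBT/TLBI split and the set lookup can be modeled as follows. This is a minimal sketch, not a description of real hardware: the class name `TLB`, the stored PTE values, and the first-in first-out replacement within a set are all assumptions for illustration.

```python
class TLB:
    """Toy set-associative TLB indexed by the low bits of the VPN."""

    def __init__(self, num_sets: int, ways: int):
        self.index_bits = num_sets.bit_length() - 1   # t, assuming num_sets is a power of two
        self.ways = ways
        self.sets = [[] for _ in range(num_sets)]     # each set holds (tag, pte) pairs

    def split(self, vpn: int):
        """Return (TLBT, TLBI) for a virtual page number."""
        tlbi = vpn & ((1 << self.index_bits) - 1)
        tlbt = vpn >> self.index_bits
        return tlbt, tlbi

    def lookup(self, vpn: int):
        """Return the cached PTE on a hit, or None on a miss."""
        tlbt, tlbi = self.split(vpn)
        for tag, pte in self.sets[tlbi]:              # TLBI selects the set
            if tag == tlbt:                           # TLBT matches a line's tag
                return pte
        return None

    def insert(self, vpn: int, pte):
        """Install a PTE, evicting the oldest line if the set is full (FIFO assumption)."""
        tlbt, tlbi = self.split(vpn)
        lines = self.sets[tlbi]
        lines[:] = [(t, p) for t, p in lines if t != tlbt]
        if len(lines) >= self.ways:
            lines.pop(0)
        lines.append((tlbt, pte))


tlb = TLB(num_sets=16, ways=4)
print(tlb.lookup(0x123))        # miss: nothing cached yet
tlb.insert(0x123, "PTE for 0x123")
print(tlb.lookup(0x123))        # hit
```

On a miss the MMU would fetch the PTE from the page table and call something like `insert` so that the next access to the same page hits.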
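TLB Hit

[Figure: CPU, MMU, and TLB on the CPU chip, connected to cache/memory. Arrows: 1) VA from CPU to MMU; 2) VPN from MMU to TLB; 3) PTE from TLB to MMU; 4) PA from MMU to cache/memory; 5) data from cache/memory to CPU.]

A TLB hit eliminates a memory access
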
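TLB Miss

[Figure: CPU, MMU, and TLB on the CPU chip, connected to cache/memory. Arrows: 1) VA from CPU to MMU; 2) VPN from MMU to TLB (miss); 3) PTEA from MMU to cache/memory; 4) PTE from cache/memory into the TLB; 5) PA from MMU to cache/memory; 6) data from cache/memory to CPU.]
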
Performance Issue in VM
• Address translation and cache accesses are serialized
• First translate from VA to PA
• Then use PA to access cache
• Slow! Can we speed it up?

[Figure: the CPU sends the VA to the MMU; the MMU sends the PA to the cache; on a hit the cache returns the data, and on a miss the PA goes to memory, which returns the data.]

Performance Issue in VM

[Figure: the virtual address is Virtual page number (VPN) followed by Page Offset; the physical address is Physical page number (PPN) followed by Page Offset. The page offset is unchanged by translation. The cache views the physical address as Tag, Set Index, Cache Line Offset, where the Tag equals the PPN.]

Virtually-Indexed, Physically-Tagged Cache
• Set Index + Cache Line Offset = Page Offset (the set index and line offset bits together fit in the page offset)
• Indexing into cache in parallel with translation (TLB access)
• If TLB hits, can get the data back in one cycle

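The reason indexing can start before translation finishes is that the set index lies entirely inside the page offset, so it is identical in the VA and the PA. The sketch below illustrates this with assumed field widths (12-bit page offset, 4-bit line offset, 8-bit set index); the function names and the example addresses are hypothetical.

```python
PAGE_OFFSET_BITS = 12   # assumption: 4 KB pages
LINE_OFFSET_BITS = 4    # assumption: 16-byte cache lines
SET_INDEX_BITS = 8      # 4 + 8 = 12, so index and line offset fill the page offset


def set_index(addr: int) -> int:
    """Extract the cache set index; works on a VA or a PA."""
    return (addr >> LINE_OFFSET_BITS) & ((1 << SET_INDEX_BITS) - 1)


def cache_tag(pa: int) -> int:
    """Extract the cache tag, which must come from the physical address."""
    return pa >> (LINE_OFFSET_BITS + SET_INDEX_BITS)


# Hypothetical mapping: virtual page 0xABCDE is stored in physical page 0x42.
va = (0xABCDE << PAGE_OFFSET_BITS) | 0x5A7
pa = (0x00042 << PAGE_OFFSET_BITS) | 0x5A7   # same page offset after translation

# The set can be selected from the VA while the TLB produces the PPN...
print(hex(set_index(va)), hex(set_index(pa)))
# ...and the tag comparison then uses the PPN from the TLB.
print(hex(cache_tag(pa)))
```

With these widths the cache tag is exactly the PPN, which is what "physically tagged" refers to.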
Any Implications?

[Figure: the virtual address is Virtual page number (VPN) followed by a 12-bit Page Offset; the cache views the physical address as Tag, Set Index (8 bits), Cache Line Offset (4 bits).]

• Assuming 4 KB page size and 16-byte cache lines.
• Set Index = 8 bits. Can only have 256 sets => limits cache size
• Increasing cache size then requires increasing associativity
• Not ideal because that requires comparing more tags
• Solutions?

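The size limit can be made concrete with a small calculation: when the set index and line offset must fit in the page offset, each way of the cache can hold at most one page worth of data. This is a sketch with illustrative function names, assuming power-of-two sizes.

```python
def max_sets(page_bytes: int, line_bytes: int) -> int:
    """Largest number of sets whose index still fits inside the page offset."""
    return page_bytes // line_bytes


def vipt_cache_bytes(page_bytes: int, line_bytes: int, ways: int) -> int:
    """Largest cache size under the constraint: sets * ways * line size."""
    return max_sets(page_bytes, line_bytes) * ways * line_bytes


# The slide's example: 4 KB pages and 16-byte lines give 256 sets.
print(max_sets(4096, 16))
# Direct-mapped, that is only 4 KB of cache; 8 ways gives 32 KB.
print(vipt_cache_bytes(4096, 16, 1), vipt_cache_bytes(4096, 16, 8))
```

Since the result simplifies to page size times associativity, the only way to grow the cache under this constraint is to add ways, which is the trade-off the bullets describe.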
Any Implications?

[Figure: the virtual address is Virtual page number (VPN) followed by a 12-bit Page Offset; the cache views the physical address as Tag, Set Index (9 bits), Cache Line Offset (4 bits), so the set index extends one bit past the page offset.]

• What if we use 9 bits for Set Index? More sets now.
• How can this still work?
• The least significant bit in VPN and PPN must be the same
• That is: an even virtual page number must be mapped to an even physical page number, and an odd virtual page number must be mapped to an odd physical page number

Where Does Page Table Live?

• It needs to be at a specific location where we can find it
• In main memory, with its start address stored in a special register (PTBR)
• Assume 4 KB pages, 48-bit virtual addresses, and 8-byte PTEs
• 2^36 PTEs in a page table (2^48 / 2^12)
• 512 GB total size per page table??!! (2^36 * 8 bytes)

• Problem: Page tables are huge
• One table per process!
• Storing them all in main memory wastes space

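The 512 GB figure follows directly from the three parameters on the slide. A minimal sketch of the arithmetic, with an illustrative function name:

```python
def flat_page_table_bytes(va_bits: int, page_bytes: int, pte_bytes: int) -> int:
    """Size of a single-level page table that covers the whole virtual address space."""
    num_ptes = 2**va_bits // page_bytes   # one PTE per virtual page
    return num_ptes * pte_bytes


size = flat_page_table_bytes(va_bits=48, page_bytes=4096, pte_bytes=8)
print(size // 2**30, "GB")   # size in gigabytes
```

This is the cost per process, since each process has its own page table.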
Solution: Page the Page Table

• Observation: Only a small number of pages (the working set) are accessed during a certain period of time, due to locality
• Put only the relevant page table entries in main memory
• Idea: Put the page table in virtual memory and swap it just like data

[Figure: the page table is placed in virtual memory (VM) and mapped into physical memory (PM); the pieces of the page table are located with a virtual address.]

Effectively: A 2-Level Page Table

• Level 1 table:
• Always in memory at a known location.
• Each L1 PTE points to the start address of an L2 page table.
• Bring that table to memory on demand.
• Level 2 table:
• Each PTE points to an actual data page

[Figure: a Level 1 table whose entries point to Level 2 tables.]

A Two-Level Page Table Hierarchy

32-bit addresses, 4 KB pages, 4-byte PTEs

[Figure: a Level 1 page table, the Level 2 page tables it points to, and the virtual memory pages they map.]
Level 1 PTE 0: points to a Level 2 page table (PTE 0 ... PTE 1023) that maps VP 0 ... VP 1023
Level 1 PTE 1: points to a Level 2 page table (PTE 0 ... PTE 1023) that maps VP 1024 ... VP 2047
Level 1 PTE 2 through PTE 7: null (unallocated pages, no Level 2 tables)
Level 1 PTE 8: points to a Level 2 page table with 1023 null PTEs (unallocated pages) followed by PTE 1023, which maps VP 9215
Remaining (1K - 9) Level 1 PTEs: null

• Level 2 page table size: 2^32 / 2^12 * 4 = 4 MB
• Level 1 page table size: (2^32 / 2^12 * 4) / 2^12 * 4 = 4 KB

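The two size formulas, and the saving in this particular example, can be checked numerically. This is a sketch with illustrative function names; the count of three Level 2 tables in use is read off the figure (Level 1 PTEs 0, 1, and 8 are non-null).

```python
def two_level_sizes(va_bits: int, page_bytes: int, pte_bytes: int):
    """Return (total bytes of all Level 2 tables, bytes of the Level 1 table)."""
    num_pages = 2**va_bits // page_bytes          # one L2 PTE per virtual page
    l2_total = num_pages * pte_bytes              # all L2 tables together
    num_l2_tables = l2_total // page_bytes        # each L2 table fills one page
    l1 = num_l2_tables * pte_bytes                # one L1 PTE per L2 table
    return l2_total, l1


def resident_bytes(l1_bytes: int, page_bytes: int, l2_tables_in_use: int) -> int:
    """Memory needed when only the L2 tables that are actually used exist."""
    return l1_bytes + l2_tables_in_use * page_bytes


l2_total, l1 = two_level_sizes(va_bits=32, page_bytes=4096, pte_bytes=4)
print(l2_total // 2**20, "MB for all L2 tables,", l1 // 2**10, "KB for L1")
print(resident_bytes(l1, 4096, l2_tables_in_use=3) // 2**10, "KB actually needed")
```

The saving comes from the null Level 1 entries: a null entry means the corresponding Level 2 table does not need to exist at all.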
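How to Access a 2-Level Page Table?

[Figure: single-level translation, for comparison. The page table base register (PTBR) locates the page table. The VPN of the virtual address indexes the page table to obtain the PPN. The physical address is the PPN followed by the PPO, which equals the VPO (bits p-1..0).]

How to Access a 2-Level Page Table?

[Figure: two-level translation. The VPN is split into VPN 1 and VPN 2. The PTBR locates the Level 1 page table; VPN 1 indexes it to locate a Level 2 page table; VPN 2 indexes the Level 2 page table to obtain the PPN. The physical address is the PPN followed by the PPO, which equals the VPO.]
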
Translating with a k-level Page Table

[Figure: k-level translation. The VPN is split into VPN 1, VPN 2, ..., VPN k. The PTBR locates the Level 1 page table; VPN i indexes the Level i page table, whose entry locates the Level i+1 page table; the Level k entry supplies the PPN. The physical address is the PPN followed by the PPO, which equals the VPO.]

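A k-level walk can be sketched with nested dictionaries standing in for the page tables. This is an illustrative model under assumptions: every level uses the same number of index bits, a missing entry stands for a null PTE, and the names `split_vpn` and `walk` are hypothetical.

```python
def split_vpn(vpn: int, k: int, bits_per_level: int) -> list:
    """Split a VPN into [VPN 1, ..., VPN k], most significant field first."""
    mask = (1 << bits_per_level) - 1
    return [(vpn >> (bits_per_level * (k - 1 - i))) & mask for i in range(k)]


def walk(va: int, root: dict, k: int, bits_per_level: int, page_bits: int):
    """Walk a k-level table; return the physical address, or None on a null PTE."""
    vpn = va >> page_bits
    vpo = va & ((1 << page_bits) - 1)
    node = root                                  # the table that PTBR points to
    for index in split_vpn(vpn, k, bits_per_level):
        node = node.get(index)                   # follow the PTE at this level
        if node is None:
            return None                          # null PTE: page fault
    ppn = node                                   # the level-k PTE holds the PPN
    return (ppn << page_bits) | vpo


# Hypothetical 2-level table for 32-bit addresses: 10 + 10 + 12 bits.
root = {1: {2: 0x2A}}
va = (1 << 22) | (2 << 12) | 0x345
print(hex(walk(va, root, k=2, bits_per_level=10, page_bits=12)))
```

Each level adds one memory access on a TLB miss, which is why the TLB matters even more with multi-level tables.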
Intel Core i7 Memory System

Processor package

Core x4, each with:
• Registers, instruction fetch, MMU (addr translation)
• L1 d-cache: 32 KB, 8-way
• L1 i-cache: 32 KB, 8-way
• L1 d-TLB: 64 entries, 4-way
• L1 i-TLB: 128 entries, 4-way
• L2 unified cache: 256 KB, 8-way
• L2 unified TLB: 512 entries, 4-way

Shared by all cores:
• QuickPath interconnect: 4 links @ 25.6 GB/s each, to other CPUs
• L3 unified cache: 8 MB, 16-way
• DDR3 memory controller: 3 x 64 bit @ 10.66 GB/s, 32 GB/s total

Main memory

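End-to-End Core i7 Address Translation

[Figure: end-to-end address translation on the Core i7.]
• The CPU issues a virtual address (VA): VPN (36 bits) followed by VPO (12 bits)
• For the TLB, the VPN is split into TLBT (32 bits) and TLBI (4 bits), which access the L1 TLB (16 sets, 4 entries/set)
• TLB hit: the TLB supplies the PPN (40 bits)
• TLB miss: the VPN is split into VPN1, VPN2, VPN3, VPN4 (9 bits each); starting from CR3, each field selects a PTE in one level of the page tables, and the last PTE supplies the PPN
• Physical address (PA): PPN (40 bits) followed by PPO (12 bits)
• The L1 d-cache (64 sets, 8 lines/set) views the PA as CT (40 bits), CI (6 bits), CO (6 bits)
• L1 hit: the 32/64-bit result returns to the CPU. L1 miss: the request goes to L2, L3, and main memory
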
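The field widths in this figure can be exercised with plain bit operations. This is a sketch that uses only the widths shown on the slide (36/12, 32/4, 9/9/9/9, 40/6/6); the function names `split_va` and `split_pa` and the example addresses are made up for illustration.

```python
def split_va(va: int) -> dict:
    """Split a 48-bit Core i7 virtual address into the fields shown in the figure."""
    vpo = va & 0xFFF                              # 12-bit page offset
    vpn = (va >> 12) & ((1 << 36) - 1)            # 36-bit virtual page number
    return {
        "VPN": vpn,
        "VPO": vpo,
        "TLBT": vpn >> 4,                         # 32-bit TLB tag
        "TLBI": vpn & 0xF,                        # 4-bit TLB index (16 sets)
        # Four 9-bit indices, one per page table level, VPN1 first.
        "VPN1-4": [(vpn >> shift) & 0x1FF for shift in (27, 18, 9, 0)],
    }


def split_pa(pa: int) -> dict:
    """Split a physical address into the L1 d-cache fields."""
    return {
        "CT": pa >> 12,                           # 40-bit cache tag (same bits as the PPN)
        "CI": (pa >> 6) & 0x3F,                   # 6-bit cache index (64 sets)
        "CO": pa & 0x3F,                          # 6-bit offset within a 64-byte line
    }


va_fields = split_va((0x123456789 << 12) | 0xABC)
pa_fields = split_pa((0x42 << 12) | 0xABC)
print(va_fields)
print(pa_fields)
```

Note that CI and CO together are 12 bits, the same width as the page offset, so the cache index can be taken from the VPO before translation completes.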
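Core i7 Level 4 Page Table Entries

Entry layout when P=1 (bit positions in parentheses):
XD (63), Unused (62-52), Page physical base address (51-12), Unused (11-9), G (8), unlabeled (7), D (6), A (5), CD (4), WT (3), U/S (2), R/W (1), P=1 (0)

Entry layout when P=0:
Available for OS (page location on disk) (63-1), P=0 (0)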

Each entry references a 4K child page. Significant fields:


P: Child page is present in memory (1) or not (0)
R/W: Read-only or read-write access permission for child page
U/S: User or supervisor mode access
WT: Write-through or write-back cache policy for this page
A: Reference bit (set by MMU on reads and writes, cleared by software)
D: Dirty bit (set by MMU on writes, cleared by software)
Page physical base address: 40 most significant bits of physical page address
(forces pages to be 4KB aligned)
XD: Disable or enable instruction fetches from this page.
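The significant fields can be pulled out of a 64-bit entry with shifts and masks. This is a sketch based on the bit positions in the layout row of this slide (P at bit 0, R/W at 1, U/S at 2, WT at 3, CD at 4, A at 5, D at 6, G at 8, base address at 51-12, XD at 63); the function name `decode_pte` and the example value are hypothetical.

```python
def decode_pte(pte: int) -> dict:
    """Decode a level 4 page table entry into its named fields."""
    if not pte & 1:
        # P = 0: the remaining bits are available to the OS (e.g., disk location).
        return {"P": 0, "os_bits": pte >> 1}
    return {
        "P": 1,
        "R/W": (pte >> 1) & 1,
        "U/S": (pte >> 2) & 1,
        "WT": (pte >> 3) & 1,
        "CD": (pte >> 4) & 1,
        "A": (pte >> 5) & 1,
        "D": (pte >> 6) & 1,
        "G": (pte >> 8) & 1,
        "base": (pte >> 12) & ((1 << 40) - 1),    # 40-bit physical page number
        "XD": (pte >> 63) & 1,
    }


# Hypothetical entry: present, writable, accessed, dirty, no-execute, page 0x42.
entry = (1 << 63) | (0x42 << 12) | 0b1100011
fields = decode_pte(entry)
print(fields)
print(hex(fields["base"] << 12))   # physical address of the 4 KB aligned page
```

Shifting the base field left by 12 gives the page's physical address, which is why pages are forced to be 4 KB aligned.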
