0% found this document useful (0 votes)
4 views

l19 VM Linux

Uploaded by

kr gael
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
4 views

l19 VM Linux

Uploaded by

kr gael
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 30

W4118 Operating Systems

Instructor: Junfeng Yang


Logistics
 Homework 5 due 3:09pm this Thursday

2
Last lecture: LRU page
replacement
 Least-Recently-Used: throw out a page that
has not been referenced for the longest time
 With locality, approximate OPT
 However, difficult to implement efficiently

 Clock: efficient, approximate LRU


 Clock extension: bias toward replacing clean
pages
 Problem with LRU
 Doesn’t work well repeated scans with data set larger
than memory
 Doesn’t consider frequency
• LFU
• LRU-k and LRU-2Q

3
Today
 Linux Memory Management
 Address Space Translation
• How does Linux translate address?

 Address Space Layout


• What does a process Address Space look like?
• What does the kernel Address Space look like?
• How to implement Copy-on-Write (COW)?

 Dynamic Memory Allocation


• How to allocate pages?
• How to allocate objects?

 Virtual Memory
• How to replace pages?
• How to load page from disk?

4
Address Translation
 Linux uses paging to translate logical
addresses to physical addresses. Linux
does not use segmentation
 More portable since some RISC architectures
don’t support segmentation
 Hierarchical paging is flexible enough

5
Recall: x86 segmentation and
paging hardware
 CPU generates logical address (seg, offset)
 Given to segmentation unit
• Which produces linear addresses
 Linear address given to paging unit
• Which generates physical address in main memory
• Paging units form equivalent of MMU

6
Segmentation
 Since x86 segmentation hardware cannot be
disabled, Linux just uses NULL mappings
 Set segment base to 0x00000000, limit to 0xffffffff
 segment offset == linear addresses
 arch/i386/kernel/head.S

 Linux defines four segments


 User code (__USER_CS)
 User data (__USER_DS)
 Kernel code (__KERNEL_CS)
 Kernel data (__KERNEL_DATA)
Segment Protection
 Current Privilege level (CPL) specifies
privileged mode or user mode
 Stored in current code segment descriptor
 User code segment: CPL = 3
 Kernel code segment: CPL = 0

 Descriptor Privilege Level (DPL) specifies


protection
 Only accessible if CPL <= DPL

 Switch between user mode and kernel mode


(e.g. system call and return)
 Hardware load the corresponding segment selector
(__USER_CS or __KERNEL_CS) into register cs

8
Paging
 Linux uses 4-level hierarchical paging
 A linear address is split into five parts, to
seamlessly handle a range of different addressing
modes
 Page Global Dir
 Page Upper Dir
 Page Middle Dir
 Page Table
 Page Offset

 Example: 32-bit address space, 4KB page without


physical address extension
 Page Global dir: 10 bits
 Page Upper dir and Page Middle dir are not used
 Page Table: 10 bits
 Page Offset: 12 bits

9
Page Table Operations
 Linux provides data structures and
operations to create, delete, read and write
page directories
 include/asm-i386/pgtable.h
 arch/i386/mm/hugetlbpage.c

 Naming convention
 pgd: Page Global Directory
 pmd: Page Middle Directory
 pud: Page Upper Directory
 pte: Page Table Entry
 Example: mk_pte(p, prot)

10
TLB Operations
 x86 uses hardware TLB
 OS does not manage TLB

 Only operation: flush TLB entries


 include/asm-i386/tlbflush.h
 load cr3: flush all TLB entries
 invlpg addr: flush a single TLB entry
• Much more efficient than flushing all TLB entries

11
Today
 Linux Memory Management
 Address Space Translation
• How does Linux translate address?

 Address Space Layout


• What does a process Address Space look like?
• What does the kernel Address Space look like?
• How to implement Copy-on-Write (COW)?

 Dynamic Memory Allocation


• How to allocate pages?
• How to allocate objects?

 Virtual Memory
• How to replace pages?
• How to load page from disk?

12
Recall: Linux Process Address
Space
0xFFFFFFFF Kernel data
kernel mode structures
Kernel space

0xC0000000 User-mode stack-area Kernel space is


also mapped into
user space 
User space from user mode
to kernel mode,
user mode
no need to switch
Shared runtime-libraries address spaces

Task’s code and data

0
Kernel Address Space
 Kernel Address Space [3G, 4G)
 [0, 896 MB) Physical Memory Mapping
• Kernel uses this mapping to manage physical memory
• physical address = logical address – 3G
 The rest 128 MB are used for
• Accessing High Memory
• Non-contiguous pages
• Fix-mapped Linear Addresses

Persistent
Physical Fix-mapped
vmalloc …vmalloc High
Memory Linear
area area Memory
Mapping addresses
Mappings

0xC0000000 0xFFFFFFFF
14
A Cool Trick: Copy-on-Write
 In fork(), parent and child often share
significant amount of memory
 Expensive to do copy all pages

 COW Idea: exploit VA to PA indirection


 Instead of copying all pages, share parent pages
in child address space
 If either process writes to shared pages, only then
is the page copied
 How to detect page write?
• Mark pages as read-only in both parent and child
address space
• On write, page fault occurs

15
Sharing Pages
 copy_process() in kernel/fork.c
 copy_mm()
 dup_mmap() // copy page tables
 copy_page_range() in mm/memory.c
 copy_pud_range()
 copy_pmd_range()
 copy_pte_range()
 copy_one_pte() // mark
readonly

16
Copy Page on Page Fault
 do_page_fault in arch/i386/mm/fault.c
 cr2 stores faulting virtual address
 handle_mm_fault in mm/memory.c
 handle_pte_fault in mm/memory.c
 if(write_access)
 do_wp_page()

17
Today
 Linux Memory Management
 Address Space Translation
• How does Linux translate address?

 Address Space Layout


• What does a process Address Space look like?
• What does the kernel Address Space look like?
• How to implement Copy-on-Write (COW)?

 Dynamic Memory Allocation


• How to allocate pages?
• How to allocate objects?

 Virtual Memory
• How to replace pages?
• How to load page from disk?

18
Dynamic Memory Allocation
 How to allocate pages?
 Data structures for page allocation
 Buddy algorithm for page allocation

 How to allocate objects?


 Slab allocation

19
Page Descriptor
 Keep track of the status of each page
frame
 struct page, include/linux/mm.h

 All stored in mem_map array

 Simple mapping between a page and its


descriptor
 Nth page’s descriptor is mem_map[N]
 virt_to_page
 page_to_pfn
Memory Zone
 Keep track of pages in different zones
 struct zone, include/linux/mmzone.h
 ZONE_DMA: <16MB
 ZONE_NORMAL: 16MB-896MB
 ZONE_HIGHMEM: >896MB
Page Allocator
 Linux use a buddy allocator for page allocation
 Buddy Allocator: Fast, simple allocation for blocks that
are 2^n bytes [Knuth 1968]

 Data structure: free lists of blocks of pages of size


2^0, 2^1, …
 Allocation restrictions: 2^n pages
 Allocation of 2^n pages:
 Search free lists (n, n+1, n+2, …) for appropriate size
• Recursively divide larger blocks until reach block of correct
size
• Insert “buddy” blocks into free lists
 Free
 Recursively coalesce block with buddy if buddy free

 Example: allocate a 256-page block


 mm/page_alloc.c
Advantages and Disadvantages of
Buddy Allocator
 Advantages
 Fast and simple compared to general dynamic
memory allocation
 Avoid external fragmentation by keeping free
physical pages contiguous
• Can use paging, but three problems:
– DMA bypasses paging
– Modifying page table leads to TLB flush
– Cannot use “super page” to increase TLB coverage

 Disadvantages
 Internal fragmentation
• Allocation of block of k pages when k != 2^n
Slab Allocator
 For objects smaller than a page
 Implemented on top of page allocator
 Each memory region is called a cache

 Two types of slab allocator


 Fixed-size slab allocator: cache contains objects of
same size
• for frequently allocated objects

 General-purpose slab allocator: caches contain


objects of size 2^n
• for less frequently allocated objects
• For allocation of object with size k, round to nearest 2^n

 mm/slab.c
Advantages and Disadvantages of slab
allocator
 Advantages
 Reduce internal fragmentation: many objects in
one page
 Fast: no need to allocate and free page frames
• Allocation: no search of objects with the right size
for fixed-size allocator; simple search for general-
purpose allocator
• Free: no merge with adjacent free blocks

 Disadvantages
 Memory overhead for bookkeeping
 Internal fragmentation for general-purpose slab
allocator
Today
 Linux Memory Management
 Address Space Translation
• How does Linux translate address?

 Address Space Layout


• What does a process Address Space look like?
• What does the kernel Address Space look like?
• How to implement Copy-on-Write (COW)?

 Dynamic Memory Allocation


• How to allocate pages?
• How to allocate objects?

 Virtual Memory
• How to replace pages?
• How to load page from disk?

26
Data Structures for Page
Replacement
 Two lists in struct zone
 active_list: hot pages
 inactive_list: cold pages

 Two bits in struct page


 PG_active: is page on active list?
 PG_referenced: has page been referenced
recently?

27
Functions for Page Replacement
 lru_cache_add*(): add to inactive or active
list
 mark_page_accessed(): called twice to
move a page from inactive to active
 page_referenced(): test if a page is
referenced
 refill_inactive_zone(): move pages from
active to inactive
When to Replace Pages?
 Usually when free_more_memory() in
fs/buffer.c called
 try_to_free_pages in mm/vmscan.c
 shrink_caches
 shrink_zone
 refill_inactive_zone
 shrink_cache
 shrink_list
 pageout()

29
How to Load Page?
 do_page_fault() in arch/i386/mm/fault.c
 cr2 stores faulting virtual address
 handle_mm_fault() in mm/memory.c
 handle_pte_fault()
 if(!pte_present(entry))
 do_no_page() // demand paging
 do_file_page() // file mapped
page
 do_swap_page() // swapped out
page

30

You might also like