Memory

Memory Management Basics

▶ Don’t have infinite RAM


▶ Do have a memory hierarchy:
▶ Cache (fast)
▶ Main (medium)
▶ Disk (slow)
▶ Memory manager has the job of using this hierarchy to
create an abstraction (illusion) of easily accessible
memory
▶ Two important memory management functions:
▶ Sharing
▶ Protection
Memory Hierarchy

Figure: the memory hierarchy — moving up from Main Memory toward
Cache Memory, cost/bit increases, storage access speed increases,
and storage capacity decreases
Monoprogramming Model

▶ Only one program at a time is in main memory; it can use the
whole of the available memory.

Figure: three possible arrangements (a)-(c) of one user program
and the operating system in RAM


Logical versus Physical Address Space

▶ A logical address space is bound to a separate physical address
space.
▶ Logical address: generated by the CPU; also referred to as a
virtual address.
▶ Physical address: the address seen by the memory unit, produced
by the memory management unit.
▶ Logical and physical addresses are the same in
compile-time and load-time address-binding schemes.
▶ Logical (virtual) and physical addresses differ in
execution-time address-binding scheme.
Program Relocation

▶ Relocation refers to the ability to load and execute a given
program at an arbitrary place in memory.
▶ Relocation is the mechanism that converts a logical address into
a physical address.
▶ To do this, there is a special register in the CPU called the
relocation register.
▶ Every address used in the program is relocated as:
▶ effective physical address = logical address + contents of the
relocation register
▶ The MMU is special hardware that performs this address binding
using the relocation scheme.
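As a minimal sketch, the relocation rule above can be expressed in a few lines (the relocation-register value below is assumed purely for illustration):

```python
# Dynamic relocation: every logical address is translated by adding
# the contents of the relocation register.
RELOCATION_REGISTER = 14000  # assumed value for this example

def to_physical(logical_address):
    """effective physical address = logical address + relocation register"""
    return logical_address + RELOCATION_REGISTER

print(to_physical(346))  # logical address 346 maps to physical 14346
```

Because only the register contents change when the program is moved, the program itself never needs to be modified — this is exactly what makes dynamic relocation cheap.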
Memory Management Unit (MMU)

▶ The MMU generates a physical address from the virtual address
provided by the program.

Figure: the CPU sends virtual addresses to the MMU, which places
physical addresses on the memory bus to the memory controller,
main memory, and disk

▶ The MMU maps virtual addresses to physical addresses and puts
them on the memory bus.
Relocation Types

▶ Two basic types of relocation:


▶ Static Relocation: the mapping is formed during the loading of
the program into memory by a relocating loader.
▶ Dynamic Relocation: mapping from the virtual address
space to physical address is performed at
execution-time.
Protection

▶ Protection means preventing unauthorized usage of memory.
▶ OS can protect the memory with the help of base and
limit registers.
▶ The base register holds the starting address of the process's
partition, while the limit register specifies the boundary (size)
of that job.
▶ That is why the limit register is also called a fencing
register.
▶ Hardware Protection Mechanism
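A sketch of the base/limit check described above (register values are hypothetical; in reality the hardware performs this comparison on every memory access, and a violation traps to the OS):

```python
def access(logical, base, limit):
    """Base/limit protection: the logical address must lie within
    [0, limit); otherwise the hardware raises a trap, modelled here
    as an exception. On success, the relocated physical address
    (base + logical) is returned."""
    if not (0 <= logical < limit):
        raise MemoryError("addressing error: trap to the OS")
    return base + logical

# A process loaded at base 3000 with a 500-byte partition (assumed):
print(access(100, base=3000, limit=500))  # valid -> physical 3100
```

Any access at or beyond `limit` (for example `access(600, base=3000, limit=500)`) would raise the trap instead of touching another process's memory.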
Memory Allocation Techniques

▶ Two types:
▶ Contiguous Storage Allocation
▶ Fixed Partition Allocation
▶ Variable Partition Allocation
▶ Non-Contiguous
▶ Paging
▶ Segmentation
Contiguous Storage Allocation

▶ A memory-resident program occupies a single contiguous block of
physical memory. The memory is partitioned into blocks of
different sizes to accommodate the programs. The partitioning may
be:
▶ Fixed Partition allocation
▶ Variable Partition allocation
Fixed Partition Allocation/Multiprogramming with
Fixed Partition

▶ In a multiprogramming environment, several programs reside in
primary memory at a time and the CPU passes control rapidly
between these programs.
▶ One way to support multiprogramming is to divide the
main memory into several partitions each of which is
allocated to a single process.
▶ Depending on how and when the partitions are
created, there may be two types of partitioning:
▶ Static partitioning
▶ Dynamic partitioning
Partitioning Types

▶ Static Partitioning: the division of memory into a number of
partitions, and their sizes, is made at the beginning, prior to
the execution of user programs, and remains fixed thereafter.
▶ Dynamic Partitioning: the size and the number of
partitions are decided during the execution time by the
OS.
Operation Principle

▶ We divide the memory into several fixed-size partitions, where
each partition accommodates only one program for execution. The
number of programs residing in memory (i.e. the degree of
multiprogramming) is bounded by the number of partitions. When a
program terminates, its partition is freed for another program
waiting in a queue.
Multiprogramming with Fixed Partitions

Figure: fixed memory partitions — (a) a separate input queue for
each partition; (b) a single input queue for all partitions
Modeling Multiprogramming (Probabilistic
viewpoint of CPU usage)

▶ Let p = the fraction of time a process spends waiting for I/O to
complete.
▶ n = number of processes in memory at once.
▶ The probability that all n processes are waiting for I/O
(CPU idle time) = p^n
▶ So, CPU utilization = 1 − p^n
▶ The following figure shows CPU utilization as a function of n
(the degree of multiprogramming)
Modeling Multiprogramming

Figure: CPU utilization as a function of the number of processes
in memory
Modeling Multiprogramming

▶ Example
▶ Let total memory = 1M = 1000 K
▶ Memory space occupied by the OS = 200 K
▶ Memory space taken by a user program = 200 K
▶ Number of processes n = (1000 − 200)/200 = 4 programs
▶ CPU utilization = 1 − (0.8)^4 ≈ 60%
Modeling Multiprogramming

▶ Add another 1M of memory; then
▶ n = (2000 − 200)/200 = 9
▶ CPU utilization = 1 − (0.8)^9 ≈ 87%
▶ Improvement = (87 − 60)/60 = 45%
▶ Again add another 1M; then
▶ n = (3000 − 200)/200 = 14
▶ CPU utilization = 1 − (0.8)^14 ≈ 96%
▶ Improvement = (96 − 87)/87 = 10%
▶ Conclusion: adding the last 1M is not worthwhile
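The percentages above follow directly from the utilization formula; a quick check, with p = 0.8 as assumed in the example:

```python
def cpu_utilization(p, n):
    """CPU utilization = 1 - p^n, where p is the fraction of time a
    process waits for I/O and n is the degree of multiprogramming."""
    return 1 - p ** n

for n in (4, 9, 14):
    # prints roughly 59-60%, 87%, and 96% respectively
    print(n, round(cpu_utilization(0.8, n) * 100), "%")
```

Note the diminishing returns: each extra megabyte buys a smaller utilization gain, which is the basis for the conclusion above.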
Advantages of Fixed Partition

▶ Implementation of this allocation scheme is simple.
▶ The processing overhead is also low.
▶ It supports multiprogramming.
▶ It requires no special costly hardware.
▶ It makes efficient utilization of processor and I/O
devices.
Disadvantages of Fixed Partitioning

▶ No single program (process) may exceed the size of the


largest partition in a given system.
▶ It does not support a system having dynamic data
structure such as stack, queue, heap etc.
▶ It limits the degree of multiprogramming which in turn
may reduce the effectiveness of short-term scheduling.
▶ The main problem is the wastage of memory by
programs that are smaller than their partitions. This is
known as Internal Fragmentation.
How to Run More Programs than Fit in Main
Memory at Once

▶ Can't keep all processes in main memory
▶ Too many (hundreds)
▶ Too big (e.g. a 200 MB program)
▶ Two approaches:
▶ Swapping: bring a program in and run it for a while
▶ Virtual memory: allow a program to run even if only part of it
is in main memory
Variable-Partition Allocation/Multiprogramming
with Dynamic Partition

▶ Partitions are created dynamically to meet the requirements of
each requesting user.
▶ When a process terminates or is swapped out, the memory manager
returns the vacated space to the pool of free memory from which
partition allocations are made.
Multiprogramming with Variable Partitions

▶ It is a solution to the wastage (internal fragmentation) of
fixed partitioning.


▶ It allows the jobs to occupy as much space as they need.
Multiprogramming with Variable Partitions

Figure: variable-partition allocation — snapshots (a)-(d) as PA
(25 K) and PB (15 K) are loaded above the OS

▶ Input queue: PA 25K, PB 15K, PC 20K, PD 20K


Multiprogramming with Variable Partitions

▶ A process can grow if there is an adjacent hole.
▶ Otherwise the growing process is moved to a hole large enough
for it, or one or more processes are swapped out to disk.
▶ It also has some degree of waste.
▶ When a process finishes and leaves a hole, the hole may not be
large enough to place a new job.
▶ Thus, in variable-partition multiprogramming, waste does occur.
▶ The following two activities can be performed to reduce wastage
of memory:
▶ (a) Coalescing
▶ (b) Compaction
External Fragmentation

▶ Suppose a 150 K hole remains and P4, which is swapped in, needs
only 100 K, so 50 K is wasted.
▶ This is known as external fragmentation, i.e. when a free
portion is too small to accommodate any process.
▶ When the total free memory is large enough for a requesting
process but cannot satisfy the request because it is not
contiguous, the storage has been fragmented into a number of
small holes (free spaces).
▶ This results in external fragmentation.
Multiprogramming with Variable Partitions

▶ (a) Coalescing
▶ The process of merging two adjacent holes to form a
single larger hole is called coalescing.
Multiprogramming with Variable Partitions

▶ (b) Compaction
▶ Even when holes are coalesced, no individual hole may
be large enough to hold the job, although the sum of
holes is larger than the storage required for a process.
▶ It is possible to combine all the holes into one big one
by moving all the processes downward as far as
possible; this technique is called memory compaction.
Compaction

Figure: compaction — (a) OS, Process-A, a 20 K hole, Process-B,
and a 10 K hole; (b) after compaction, Process-A and Process-B sit
next to the OS with the free space collected at the top
Fixed vs. Variable Partitioning

▶ Fixed: the OS decides the partition sizes only once, at system
boot time. Variable: the OS has to decide about partition size
every time a new process is chosen by the long-term scheduler.
▶ Fixed: the degree of multiprogramming is fixed. Variable: the
degree of multiprogramming varies depending on program sizes.
▶ Fixed: leads to internal fragmentation. Variable: leads to
external fragmentation.
▶ Fixed: used by IBM 360 DOS and OS/MFT. Variable: used by IBM
OS/MVT.
Drawbacks of Compaction

▶ Relocation information must be maintained.


▶ System must stop everything during compaction.
▶ Memory compaction requires lots of CPU time.
▶ For example:
▶ On a 256MB machine that can copy 4 bytes in 40ns, it
takes 2.7 sec to compact all of memory.
Swapping

▶ If there is not enough main memory to hold all the


currently active processes, the excess processes must
be kept on the disk and brought in to run dynamically.
▶ Swapping consists of moving processes between main memory and
disk.
▶ Relocation may be required during swapping.
Swapping, a Picture

Figure: swapping — processes are brought into holes above the OS
and swapped back out, shown as transitions (a) through (g)

▶ Holes can be compacted by copying programs into them, but this
takes too much time.
Bitmaps

▶ (a) A picture of memory containing processes and holes
▶ (b) Each bit in a bitmap corresponds to an allocation unit of
storage (e.g. a few bytes) in memory
▶ (c) The same information as a linked list (P: process, H: hole)
Memory Management with Linked Lists

▶ Each entry in the list specifies a hole (H) or process (P).


▶ This is the address at which it starts, the length, and a
pointer to the next entry.

Memory Management with Linked Lists

Figure: updating the linked list when process X terminates —
(a) before X terminates; (b) after X terminates; (c) and (d) show
further coalescing with neighbouring holes
Storage Placement Strategies

▶ 1. First fit:
▶ The memory manager allocates the first hole that is big
enough. It stops the searching as soon as it finds a free
hole that is large enough.
▶ Advantages: It is a fast algorithm because it searches as
little as possible.
▶ Disadvantages: Not good in terms of storage utilization.
▶ 2. Next fit:
▶ It works the same way as first fit, except that it keeps
track of where it is whenever it finds a suitable hole.
Storage Placement Strategies

▶ 3. Best fit:
▶ Allocate the smallest hole that is big enough.
▶ Best fit searches the entire list and takes the smallest
hole that is big enough to hold the new process.
▶ Best fit tries to find a hole that is close to the actual size
needed.
▶ Advantages: better storage utilization than first fit.
▶ Disadvantages: slower than first fit because it requires
searching the whole list each time.
Storage Placement Strategies

▶ 4. Worst fit:
▶ Allocate the largest hole.
▶ It searches the entire list and takes the largest hole; rather
than creating a tiny leftover hole, it produces the largest
leftover hole, which may be more useful.
▶ Advantages: sometimes gives better storage utilization than
first fit and best fit.
▶ Disadvantages: not good in terms of either performance or
utilization.
Memory Allocation Techniques

▶ Problem:
▶ Q: Given the memory partitions of 100K, 500K, 200K,
300K and 600K (in order), how would each of the
First-fit, Best-fit, and Worst-fit algorithms place
processes of 212K, 417K, 112K, and 426K (in order)?
▶ Which algorithm makes the most efficient use of
memory?
First-fit

▶ First-fit: search the list of available memory and


allocate the first block that is big enough
▶ Processes placement:
▶ 212K → 500K partition
▶ 417K → 600K partition
▶ 112K → 288K partition (New partition 288K = 500K -
212K)
▶ 426K must wait
Best-fit

▶ Best-fit: search the entire list of available memory and


allocate the smallest block that is big enough
▶ Processes placement:
▶ 212K → 300K partition
▶ 417K → 500K partition
▶ 112K → 200K partition
▶ 426K → 600K partition
Worst-fit

▶ Worst-fit: search the entire list of available memory and


allocate the largest block. The justification for this
scheme is that the leftover block produced would be
larger and potentially more useful than that produced
by the best-fit approach
▶ Processes placement:
▶ 212K → 600K partition
▶ 417K → 500K partition
▶ 112K → 388K partition
▶ 426K must wait
▶ In this example Best-fit turns out to be the best, since it
allocates memory for all processes
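The three placements above can be reproduced with a small simulation (a sketch: holes shrink in place when a request is carved out of them, modelling the creation of a new smaller partition):

```python
def allocate(holes, requests, strategy):
    """Return, for each request, the size of the hole it was placed
    in (None if it must wait). 'first' takes the first hole that
    fits, 'best' the smallest that fits, 'worst' the largest."""
    holes = list(holes)
    placements = []
    for req in requests:
        fits = [(i, h) for i, h in enumerate(holes) if h >= req]
        if not fits:
            placements.append(None)        # request must wait
            continue
        if strategy == 'first':
            i, h = fits[0]
        elif strategy == 'best':
            i, h = min(fits, key=lambda f: f[1])
        else:                              # 'worst'
            i, h = max(fits, key=lambda f: f[1])
        placements.append(h)
        holes[i] = h - req                 # leftover becomes a smaller hole

    return placements

parts, procs = [100, 500, 200, 300, 600], [212, 417, 112, 426]
print(allocate(parts, procs, 'first'))   # [500, 600, 288, None]
print(allocate(parts, procs, 'best'))    # [300, 500, 200, 600]
print(allocate(parts, procs, 'worst'))   # [600, 500, 388, None]
```

Only best fit places all four processes, matching the conclusion above.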
Swapping System Problem

▶ Q: Consider a swapping system in which memory


consists of the following hole sizes in memory order:
10K, 4K, 20K, 18K, 7K, 9K, 12K, and 15K. Which hole is
taken for successive segment requests of:
▶ (i) 12K
▶ (ii) 10K
▶ (iii) 9K
▶ under
▶ (a) First-fit?
▶ (b) Best-fit?
▶ (c) Worst-fit?
Swapping System Solution

Request   First-Fit   Best-Fit   Worst-Fit
12K       20K         12K        20K
10K       10K         10K        18K
9K        18K         9K         15K
Paging

▶ Partition memory into small equal fixed-size chunks


and divide each process into the same size chunks
▶ The chunks of a process are called pages
▶ The chunks of memory are called frames
Paging

▶ Operating system maintains a page table for each


process
▶ Contains the frame location for each page in the
process
▶ A memory address consists of a page number and an offset within
the page
Processes and Frames
▶ 1. System with a number of frames allocated
▶ 2. Process A, stored on disk, consists of four pages.
When it comes time to load this process, the operating
system finds four free frames and loads the four pages
of process A into the four frames.
▶ 3. Process B, consisting of three pages, and process C,
consisting of four pages, are subsequently loaded.
▶ 4. Then process B is suspended and is swapped out of
main memory.
▶ 5. Later, all of the processes in main memory are
blocked, and the operating system needs to bring in a
new process, process D, which consists of five pages.
The Operating System loads the pages into the
available frames and updates the page table
Figure 7.10: data structures for the example of Figure 7.9 at time
epoch (f) — main memory frames holding pages such as A.0 and A.1,
together with the page tables of processes A, B, and C mapping
each page number to a frame number
Pages and Page Frames

▶ Virtual addresses divided up into units, called pages


and the corresponding units in physical memory are
called page frames.
▶ Page sizes typically range from 512 bytes to 64 KB
▶ The pages and page frames are always the same size.
▶ Transfer between RAM and disk is in whole pages
Paging

Figure: paging — a 16-bit logical address is split into a 6-bit
page number and a 10-bit offset; the process page table translates
it into a 16-bit physical address
Mapping of Pages to Page Frames

Figure: the mapping of pages in the virtual address space to page
frames in physical memory

▶ 16-bit addresses, 4 KB pages
▶ 32 KB physical memory, 16 virtual pages and 8 page frames
Example

▶ MOV REG, 0
▶ → Virtual address 0 is sent to MMU. The MMU sees that
this virtual address falls in page 0 (0 to 4095), which is
mapped to page frame 2 (8192 to 12287).
▶ → Thus it transforms the address to 8192 & outputs
8192 onto the bus.
▶ Similarly,
▶ MOV REG, 8192 is effectively transformed into MOV REG, 24576.
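A sketch of the translation the MMU performs in this example. The page-table entries are assumed from the mapping above: virtual page 0 → frame 2, virtual page 2 → frame 6 (with 4 KB pages):

```python
PAGE_SIZE = 4096
# Partial page table for this example (virtual page -> page frame);
# the two entries are taken from the mapping assumed in the text.
page_table = {0: 2, 2: 6}

def mmu(virtual_address):
    """Split the virtual address into (page, offset), look the page
    up in the page table, and rebuild the physical address."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    return page_table[page] * PAGE_SIZE + offset

print(mmu(0))      # page 0 -> frame 2: physical 8192
print(mmu(8192))   # page 2 -> frame 6: physical 24576
```

Note the offset passes through unchanged; only the page number is replaced by a frame number.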
Page Fault Processing

▶ Present/absent bit tells whether page is in memory.


▶ What happens if address is not in memory?
▶ Trap to the OS
▶ OS picks page to write to disk
▶ Brings page with (needed) address into memory
▶ Re-starts instruction
Page Table

▶ Virtual address-[virtual page number, offset]


▶ Virtual page number used to index into page table to
find page frame number
▶ If present/absent bit is set to 1, attach page frame
number to the front of the offset, creating the physical
address
▶ which is sent on the memory bus
Mapping/Paging Mechanism

▶ Let us see how the incoming virtual address 8196
(0010000000000100 in binary) is mapped to physical address 24580.
▶ The incoming 16-bit virtual address is split into a 4-bit page
number and a 12-bit offset.
▶ With 4 bits we can have 16 pages, and with 12 bits for the
offset we can address 2^12 = 4096 bytes within a page.
▶ The page number is used as an index into the page table, which
yields the corresponding page frame number.
▶ The page table also contains a Present/Absent bit.
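The bit-level split described above can be sketched with shifts and masks (the single page-table entry is assumed so that virtual page 2 maps to frame 6, consistent with 8196 → 24580):

```python
OFFSET_BITS = 12                 # 4 KB pages -> 12-bit offset
page_table = {2: 6}              # assumed: virtual page 2 -> frame 6

def translate(vaddr):
    page = vaddr >> OFFSET_BITS               # top 4 bits of a 16-bit address
    offset = vaddr & ((1 << OFFSET_BITS) - 1) # low 12 bits pass through
    frame = page_table[page]                  # page-table lookup
    return (frame << OFFSET_BITS) | offset    # glue frame number onto offset

print(translate(8196))  # 0010 000000000100 -> page 2, offset 4 -> 24580
```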
Structure of Page Table Entry

Figure: structure of a page table entry — the page frame number
plus caching-disabled, referenced, modified, protection, and
present/absent bits

▶ Frame number: The actual page frame number.


▶ Present (1) / Absent (0) bit: Defines whether the virtual
page is currently mapped or not.
▶ Protection bit: Kinds of access permission;
read/write/execution.
▶ Modified bit: set when the page has been written to (is dirty)
since it was loaded.
▶ Referenced bit: used by the page replacement strategy.
▶ Caching disabled: used for pages that map onto device registers
rather than memory.
Problems for Paging

▶ Virtual to physical mapping is done on every memory


reference
▶ The page table can be extremely large.
▶ The mapping must be fast
▶ If the virtual address space is large, the page table will
be large.
▶ For example,
▶ Modern computers use virtual addresses of at least
32-bits. With say, a 4-KB page size, a 32-bit address
space has 1 million pages.
Multi-level Page Tables

▶ Want to avoid keeping the entire page table in memory


because it is too big
▶ Hierarchy of page tables
▶ The hierarchy is a page table of page tables
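How a virtual address indexes such a hierarchy can be sketched as follows, assuming the common two-level layout for 32-bit addresses: a 10-bit top-level index, a 10-bit second-level index, and a 12-bit offset (the example address is likewise just an illustration):

```python
def split(vaddr):
    """Split a 32-bit virtual address into the two page-table
    indices and the page offset (10/10/12 layout assumed)."""
    pt1 = vaddr >> 22              # index into the top-level table
    pt2 = (vaddr >> 12) & 0x3FF    # index into the second-level table
    offset = vaddr & 0xFFF         # byte within the 4 KB page
    return pt1, pt2, offset

print(split(0x00403004))  # -> (1, 3, 4)
```

Only the second-level tables that are actually used need to exist, which is why the hierarchy avoids keeping the full table in memory.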
Multilevel Page Tables

Figure: (a) a single flat page table; (b) a hierarchy in which a
top-level page table points to second-level page tables
Virtual Memory

▶ Demand paging
▶ Do not require all pages of a process in memory
▶ Bring in pages as required
▶ Page fault
▶ Required page is not in memory
▶ Operating System must swap in required page
▶ May need to swap out a page to make space
▶ Select page to throw out based on recent history
Thrashing

▶ Too many processes in too little memory


▶ Operating System spends all its time swapping
▶ Little or no real work is done
▶ Disk light is on all the time
▶ Solutions
▶ Good page replacement algorithms
▶ Reduce the number of processes running
▶ Add more memory
Translation Lookaside Buffer

▶ Every virtual memory reference causes two physical


memory access
▶ Fetch page table entry
▶ Fetch data
▶ Use special cache for page table
▶ TLB
TLB and Cache Operation

Figure: TLB Operation


Page Replacement Algorithms

▶ Optimal
▶ FIFO (First In First Out)
▶ LFU (Page Based): (Least Frequently Used)
▶ LFU (Frame Based)
▶ LRU (Least Recently Used)
▶ MFU (Most Frequently Used)
▶ Clock
Huge Pages
▶ What are Huge Pages?
▶ Memory is managed in blocks called pages; default size is 4KB.
▶ Huge Pages use much larger page sizes (commonly 2MB or 1GB),
reducing the number of pages needed for large memory allocations.
▶ Why Use Huge Pages?
▶ Reduces Page Table Size: Fewer pages mean smaller page tables,
lowering memory overhead.
▶ Improves TLB Efficiency: Translation Lookaside Buffer (TLB) can
cache fewer, larger entries, reducing TLB misses and speeding up
address translation.
▶ Performance: Critical for large applications (e.g., databases like
PostgreSQL, Oracle) to avoid performance bottlenecks.
▶ Example:
▶ A 4GB process with 4KB pages needs 1 million pages and a 4MB page
table.
▶ With 2MB pages, only 2,048 pages and a 16KB page table are needed.
▶ Other Benefits:
▶ Huge Pages are locked in memory and never swapped out, providing
consistent performance.
▶ Reduces kernel bookkeeping and overhead.
Transparent Huge Pages (THP)
▶ What is THP?
▶ Linux feature that automatically uses huge pages (2MB) for large
memory allocations, without requiring application changes.
▶ THP works with anonymous memory and tmpfs/shmem.
▶ Benefits
▶ No App Changes Needed: Applications benefit from huge pages
transparently.
▶ Performance: Reduces TLB misses, page faults, and memory
management overhead, improving throughput for memory-intensive
workloads (e.g., Java VMs, ML frameworks, PostgreSQL).
▶ Automatic Management: THP can promote or demote pages as
needed.
▶ Limitations and Caveats
▶ Fragmentation: Large contiguous memory is needed; fragmentation
can reduce effectiveness or cause latency spikes.
▶ Less Control: Fine-tuned applications (e.g., databases) may prefer
manual HugePages for predictability.
▶ Only 2MB Pages: THP supports only 2MB pages on x86-64.
▶ Example:
▶ PostgreSQL with a 1GB table can use 2MB pages, greatly reducing TLB
misses and improving performance.
