MODULE-4
MEMORY MANAGEMENT
Main Memory Management Strategies
Memory management is concerned with managing primary memory. Memory consists of an array of bytes or words, each with its own address.
Every program to be executed must be in memory; instructions must be fetched from memory before they are executed.
In a multitasking OS, memory management is more complex: as processes are swapped in and out of the CPU, their code and data must be swapped in and out of memory.
Basic Hardware
Main memory, cache, and the CPU registers built into the processor are the only storage that the CPU can access directly.
The program and data must be brought into memory from the disk for the process to run. Each process has a separate memory space and must access only its own range of legal addresses. Memory protection is required to ensure correct operation, and this protection is provided by the hardware.
Two registers are used - a base register and a limit register.
The base register holds the smallest legal physical memory address.
The limit register specifies the size of the range.
For example, if the base register holds 300040 and the limit register holds 120900, then the program can legally access all addresses from 300040 through 420939 (base + limit - 1), inclusive, as shown in the figure below.
The base and limit registers can be loaded only by the operating system, which uses a special privileged instruction. Since privileged instructions can be executed only in kernel mode, only the operating system can load the base and limit registers.
The run-time mapping from logical to physical addresses is done by the memory-management unit (MMU): the value in the base (relocation) register is added to every address generated by a user process.
For example, if the base is at 14000, then an attempt by the user to address location 0 is dynamically relocated to location 14000; an access to location 346 is mapped to location 14346. The user program never sees the real physical addresses.
The size of the process is thus limited to the size of the physical memory.
Dynamic Loading
With dynamic loading, a routine is not loaded until it is called. This gives better memory-space utilization.
Advantages:
1. An unused routine is never loaded.
2. Useful when large amounts of code are needed to handle infrequently occurring cases.
3. Although the total program-size may be large, the portion that is used (and hence loaded) may be
much smaller.
4. Does not require special support from the OS.
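As an illustration, on POSIX systems a program can load a routine on demand through the dlopen interface. The sketch below is only illustrative: the library name libhelpers.so and the routine rare_routine are hypothetical (compile with -ldl on Linux).

    /* Dynamic loading with the POSIX dlopen API: the routine is not
     * brought into memory until it is actually needed.
     * "libhelpers.so" and "rare_routine" are hypothetical names. */
    #include <stdio.h>
    #include <dlfcn.h>

    int main(void)
    {
        /* Nothing from the library is loaded yet. */
        void *handle = dlopen("./libhelpers.so", RTLD_LAZY);
        if (handle == NULL) {
            fprintf(stderr, "dlopen failed: %s\n", dlerror());
            return 1;
        }

        /* Look up the routine only when it is about to be called.
         * (POSIX allows this object-to-function pointer cast.) */
        void (*rare_routine)(void) =
            (void (*)(void))dlsym(handle, "rare_routine");
        if (rare_routine != NULL)
            rare_routine();          /* loaded on demand */

        dlclose(handle);             /* the routine can be unloaded again */
        return 0;
    }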
Shared libraries
A library may be replaced by a new version, and all programs that reference the library will
automatically use the new one.
Version information is included in both the program and the library so that programs won't accidentally execute incompatible versions.
Swapping
A process must be loaded into memory in order to execute.
If there is not enough memory available to keep all running processes in memory at the
same time, then some processes that are not currently using the CPU may have their
memory swapped out to a fast local disk called the backing store.
Swapping is the process of moving a process from memory to backing store and moving
another process from backing store to memory. Swapping is a very slow process compared
to other operations.
A variant of this swapping policy is used with priority-based scheduling algorithms.
If a higher-priority process arrives and wants service, the memory manager can swap out the
lower-priority process and then load and execute the higher-priority process.
When the higher-priority process finishes, the lower-priority process can be swapped back in
and continued. This variant of swapping is called roll out, roll in.
Normally, a process that is swapped out will be swapped back into the same memory space that it occupied previously. Whether this restriction holds depends on the address-binding method.
The system maintains a ready queue consisting of all the processes whose memory images
are on the backing store or in memory and are ready to run.
Swapping depends upon address-binding:
If binding is done at load-time, then process cannot be easily moved to a different location.
If binding is done at execution-time, then a process can be swapped into a different memory-
space, because the physical-addresses are computed during execution-time.
Major part of swap-time is transfer-time; i.e. total transfer-time is directly proportional to
the amount of memory swapped.
The figure below shows swapping of two processes using a disk as a backing store.
Example:
Assume that the user process is 10 MB in size and the backing store is a standard hard disk with a
transfer rate of 40 MB per second.
The actual transfer of the 10-MB process to or from main memory takes 10000 KB/40000 KB per
second = 1/4 second
= 250 milliseconds.
Assuming that no head seeks are necessary, and assuming an average latency of 8 milliseconds,
the swap time is 258 milliseconds. Since we must both swap out and swap in, the total swap time
is about 516 milliseconds.
Contiguous Memory Allocation
1. Fixed-sized Partitioning
The memory is divided into fixed-sized partitions.
Each partition may contain exactly one process.
The degree of multiprogramming is bound by the number of partitions.
When a partition is free, a process is selected from the input queue and loaded into the free
partition.
When the process terminates, the partition becomes available for another process.
2. Variable-sized Partitioning
The OS keeps a table indicating which parts of memory are available and which parts are
occupied.
A hole is a block of available memory. Normally, memory contains a set of holes of various
sizes.
Initially, all memory is available for user-processes and considered one large hole.
When a process arrives, it is allocated memory from a hole large enough to accommodate it.
If the hole is too large, it is split: we allocate only as much memory as is needed and keep the remaining memory available to satisfy future requests.
Three strategies used to select a free hole from the set of available holes:
1. First Fit: Allocate the first hole that is big enough. Searching can start either at the beginning of the
set of holes or at the location where the previous first-fit search ended.
2. Best Fit: Allocate the smallest hole that is big enough. We must search the entire list, unless the list
is ordered by size. This strategy produces the smallest leftover hole.
3. Worst Fit: Allocate the largest hole. Again, we must search the entire list, unless it is sorted by size.
This strategy produces the largest leftover hole.
First-fit and best fit are better than worst fit in terms of decreasing time and storage utilization.
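A minimal sketch of the three strategies, scanning an illustrative table of hole sizes for a single request (all sizes here are made-up values):

    /* First fit, best fit and worst fit over a table of free holes. */
    #include <stdio.h>

    #define NHOLES 4

    int main(void)
    {
        int hole[NHOLES] = {100, 500, 200, 300};   /* free-hole sizes in KB */
        int request = 212;                          /* memory request in KB */
        int first = -1, best = -1, worst = -1;

        for (int i = 0; i < NHOLES; i++) {
            if (hole[i] < request)
                continue;                           /* hole too small */
            if (first == -1)
                first = i;                          /* first fit: first adequate hole */
            if (best == -1 || hole[i] < hole[best])
                best = i;                           /* best fit: smallest adequate hole */
            if (worst == -1 || hole[i] > hole[worst])
                worst = i;                          /* worst fit: largest hole */
        }

        printf("first fit -> hole %d, best fit -> hole %d, worst fit -> hole %d\n",
               first, best, worst);                 /* prints holes 1, 3 and 1 here */
        return 0;
    }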
Fragmentation
Two types of memory fragmentation:
1. Internal fragmentation
2. External fragmentation
1. Internal Fragmentation
The general approach is to break the physical-memory into fixed-sized blocks and allocate
memory in units based on block size.
The allocated-memory to a process may be slightly larger than the requested-memory.
The difference between requested-memory and allocated-memory is called internal
fragmentation i.e. Unused memory that is internal to a partition.
2. External Fragmentation
External fragmentation occurs when there is enough total memory-space to satisfy a request
but the available-spaces are not contiguous. (i.e. storage is fragmented into a large number of
small holes).
Both the first-fit and best-fit strategies for memory-allocation suffer from external
fragmentation.
Statistical analysis of first-fit reveals that given N allocated blocks, another 0.5 N blocks will
be lost to fragmentation. This property is known as the 50-percent rule.
Paging
Paging is a memory-management scheme.
Paging is a storage mechanism to retrieve processes from secondary storage to the main
memory as pages.
The primary concept behind paging is to break each process into individual pages.
The primary memory would also be separated into frames.
This permits the physical-address space of a process to be non-contiguous.
This also solves the considerable problem of fitting memory-chunks of varying sizes onto
the backing-store.
Traditionally, support for paging has been handled by hardware. In recent designs, the hardware and OS are closely integrated.
Basic Method of Paging
The basic method for implementing paging involves breaking physical memory into fixed-
sized blocks called frames and breaking logical memory into blocks of the same size called
pages.
When a process is to be executed, its pages are loaded into any available memory frames
from the backing store.
The backing store is divided into fixed-sized blocks that are of the same size as the memory
frames.
The hardware support for paging is illustrated in the figure below.
To show how to map logical memory into physical memory, consider a page size of 4 bytes and
physical memory of 32 bytes (8 pages) as shown in below figure.
a. Logical address 0 is page 0 and offset 0 and Page 0 is in frame 5. The logical address 0 maps to
physical address [(5*4) + 0]=20.
b. Logical address 3 is page 0 and offset 3 and Page 0 is in frame 5. The logical address 3 maps to
physical address [(5*4) + 3]= 23.
c. Logical address 4 is page 1 and offset 0 and page 1 is mapped to frame 6. So logical address 4 maps
to physical address [(6*4) + 0]=24.
d. Logical address 13 is page 3 and offset 1 and page 3 is mapped to frame 2. So logical address 13
maps to physical address [(2*4) + 1]=9.
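The translation in examples (a) to (d) can be reproduced with a small sketch. The page table below copies the mapping used in the text (page 0 to frame 5, page 1 to frame 6, page 3 to frame 2); the entry for page 2 is an assumption, since the examples do not use it.

    /* Logical-to-physical translation for the 4-byte-page example above. */
    #include <stdio.h>

    #define PAGE_SIZE 4

    int page_table[] = {5, 6, 1, 2};   /* indexed by page number; entry for
                                          page 2 is assumed */

    int translate(int logical)
    {
        int p = logical / PAGE_SIZE;               /* page number */
        int d = logical % PAGE_SIZE;               /* offset within the page */
        return page_table[p] * PAGE_SIZE + d;      /* frame base + offset */
    }

    int main(void)
    {
        printf("%2d -> %d\n", 0,  translate(0));   /* 0  -> 20 */
        printf("%2d -> %d\n", 3,  translate(3));   /* 3  -> 23 */
        printf("%2d -> %d\n", 4,  translate(4));   /* 4  -> 24 */
        printf("%2d -> %d\n", 13, translate(13));  /* 13 -> 9  */
        return 0;
    }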
In a paging scheme there is no external fragmentation: any free frame can be allocated to a process that needs it. However, there may be internal fragmentation.
If the memory requirements of a process do not happen to coincide with page boundaries,
the last frame allocated may not be completely full.
For example, if page size is 2,048 bytes, a process of 72,766 bytes will need 35 pages plus 1,086
bytes. It will be allocated 36 frames, resulting in internal fragmentation of 2,048 - 1,086= 962
bytes.
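The same arithmetic as a small check, using the values from the example above:

    /* Reproducing the internal-fragmentation arithmetic from the text. */
    #include <stdio.h>

    int main(void)
    {
        int page_size = 2048;                                   /* bytes */
        int proc_size = 72766;                                  /* bytes */
        int frames = (proc_size + page_size - 1) / page_size;   /* round up: 36 */
        int internal = frames * page_size - proc_size;          /* 962 bytes */

        printf("frames = %d, internal fragmentation = %d bytes\n",
               frames, internal);
        return 0;
    }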
When a process arrives in the system to be executed, its size expressed in pages is examined.
Each page of the process needs one frame. Thus, if the process requires n pages, at least n
frames must be available in memory. If n frames are available, they are allocated to this
arriving process.
The first page of the process is loaded into one of the allocated frames, and the frame number is put in the page table for this process. The next page is loaded into another frame, its frame number is put into the page table, and so on, as shown in the figure below.
Figure: Free frames (a) before allocation and (b) after allocation.
Hardware Support
Translation Lookaside Buffer
A translation lookaside buffer (TLB) is a special, small, fast-lookup hardware cache.
A translation lookaside buffer is a memory cache that stores the recent translations of
virtual memory to physical memory.
It is used to reduce the time taken to access a user memory location. It can be called an
address-translation cache.
Each entry in the TLB consists of two parts: a key (or tag) and a value.
When the associative memory is presented with an item, the item is compared with all keys
simultaneously. If the item is found, the corresponding value field is returned. The search is
fast; the hardware, however, is expensive. Typically, the number of entries in a TLB is small,
often numbering between 64 and 1,024.
The TLB contains only a few of the page-table entries.
Working:
When a logical-address is generated by the CPU, its page-number is presented to the TLB.
If the page-number is found (TLB hit), its frame-number is immediately available and used
to access memory.
If the page number is not in the TLB (TLB miss), a memory reference to the page table must be made. The obtained frame number is then used to access memory, as shown in the figure below.
Disadvantage:
Hardware is expensive.
Some TLBs have wired-down entries that cannot be removed from the TLB.
Some TLBs store an ASID (address-space identifier) in each TLB entry; the ASID uniquely identifies each process and provides address-space protection for that process.
The percentage of times that a page number is found in the TLB is called the hit ratio. For example, an 80-percent hit ratio means that we find the desired page number in the TLB 80 percent of the time. If it takes 20 nanoseconds to search the TLB and 100 nanoseconds to access memory, then a mapped-memory access takes 120 nanoseconds when the page number is in the TLB. If we fail to find the page number in the TLB (20 nanoseconds), then we must first access memory for the page table and frame number (100 nanoseconds) and then access the desired byte in memory (100 nanoseconds), for a total of 220 nanoseconds. Thus, with hit ratio a, TLB search time t, and memory access time m, the effective access time is
Effective Access Time (EAT) = a x (t + m) + (1 - a) x (t + 2m)
EAT = 0.80 x 120 + 0.20 x 220 = 140 nanoseconds.
In this example, we suffer a 40-percent slowdown in memory-access time (from 100 to 140
nanoseconds).
For a 98-percent hit ratio we have
Effective Access Time (EAT) = 0.98 x 120 + 0.02 x 220 = 122 nanoseconds.
This increased hit rate produces only a 22 percent slowdown in access time.
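The EAT arithmetic above can be captured in a small helper; the timings and hit ratios are the ones used in the text:

    /* EAT = alpha*(t_tlb + t_mem) + (1 - alpha)*(t_tlb + 2*t_mem) */
    #include <stdio.h>

    double eat(double alpha, double t_tlb, double t_mem)
    {
        return alpha * (t_tlb + t_mem) + (1.0 - alpha) * (t_tlb + 2.0 * t_mem);
    }

    int main(void)
    {
        printf("80%% hit ratio: %.0f ns\n", eat(0.80, 20, 100));  /* 140 ns */
        printf("98%% hit ratio: %.0f ns\n", eat(0.98, 20, 100));  /* 122 ns */
        return 0;
    }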
Protection
Memory-protection is achieved by protection-bits for each frame.
The protection-bits are kept in the page-table.
One protection-bit can define a page to be read-write or read-only.
Every reference to memory goes through the page table to find the correct frame number.
Firstly, the physical-address is computed. At the same time, the protection-bit is checked to
verify that no writes are being made to a read-only page.
An attempt to write to a read-only page causes a hardware-trap to the OS (or memory
protection violation).
Valid-Invalid Bit
This bit is attached to each entry in the page table.
Valid: "valid" indicates that the associated page is in the process's logical address space and is thus a legal page.
Invalid: "invalid" indicates that the page is not in the process's logical address space.
Illegal addresses are trapped by use of the valid-invalid bit.
The OS sets this bit for each page to allow or disallow access to the page.
Figure: Valid (v) or invalid (i) bit in a page-table
Shared Pages
An advantage of paging is the possibility of sharing common code.
Re-entrant code (pure code) is non-self-modifying code; it never changes during execution.
Two or more processes can execute the same code at the same time.
Each process has its own copy of registers and data-storage to hold the data for the process's
execution.
The data for 2 different processes will be different.
For example, if several users run the same text editor, only one copy of the editor need be kept in physical memory.
Each user's page table maps onto the same physical copy of the editor, but data pages are mapped onto different frames.
Disadvantage:
Systems that use inverted page-tables have difficulty implementing shared-memory.
Hierarchical Paging
For large logical-address spaces, the page table itself can be paged, giving a two-level paging scheme. The logical address is divided into a page number and a page offset, and the page number is further divided into two parts, p1 and p2, where p1 is an index into the outer page table, and p2 is the displacement within the page of the inner page table.
The address-translation method for this architecture is shown in the figure below. Because address translation works from the outer page table inward, this scheme is also known as a forward-mapped page table.
Segmentation
Basic Method of Segmentation
This is a memory-management scheme that supports the user view of memory (Figure 1).
A logical-address space is a collection of segments.
Each segment has a name and a length.
The addresses specify both the segment name and the offset within the segment.
Normally, the user program is compiled, and the compiler automatically constructs segments reflecting the input program.
For example: the code, global variables, the heap (from which memory is allocated), the stacks used by each thread, and the standard C library.
VIRTUAL MEMORY MANAGEMENT
Virtual memory is a technique that allows for the execution of partially loaded process.
Advantages:
i. A program is not limited by the amount of physical memory available; the user can write programs for an extremely large virtual address space.
ii. Since each program takes less physical memory, more than one program can be run at the same time, which increases throughput and CPU utilization.
iii. Less I/O is needed to swap or load user programs into memory, so each user program runs faster.
Virtual memory is the separation of users’ logical memory from physical memory. This
separation allows an extremely large virtual memory to be provided when there is less physical
memory.
Separating logical memory from physical memory also allows files and memory to be shared
by several different processes through page sharing.
DEMAND PAGING
Demand paging is a memory management technique used by operating systems to
optimize the use of memory resources.
Demand paging system loads pages only on demand.
Demand paging is similar to a paging system with swapping: when we want to execute a process, we swap it into memory; otherwise, it is not loaded into memory.
A swapper manipulates entire processes, whereas a pager manipulates the individual pages of a process.
The working of demand paging is as follows.
Basic concept:
Instead of swapping the whole process the pager swaps only the necessary pages
into memory.
Thus, it avoids reading unused pages and decreases the swap time and amount of
physical memory needed.
The valid-invalid bit scheme can be used to distinguish between the pages that are
on the disk and that are in memory.
With each page table entry, a valid–invalid bit is associated.
(v ⇒ in-memory, i⇒not-in-memory)
Initially valid–invalid bit is set to i on all entries.
Figure: Example of a page-table snapshot.
During address translation, if the valid-invalid bit in a page-table entry is i, a page fault occurs.
If the bit is valid, the page is both legal and in memory.
If the bit is invalid, the page is either not valid (not in the logical address space of the process) or valid but currently on the disk.
Marking a page invalid has no effect if the process never accesses that page.
If the process tries to access a page marked invalid, a page-fault trap is generated. This trap is the result of the OS's failure to bring the desired page into memory.
Page Fault
If a page is needed that was not originally loaded up, then a page fault trap is generated.
Steps in Handling a Page Fault
1. The memory address requested is first checked, to make sure it was a valid memory
request.
2. If the reference is to an invalid page, the process is terminated. Otherwise, if the
page is not present in memory, it must be paged in.
3. A free frame is located, possibly from a free-frame list.
4. A disk operation is scheduled to bring in the necessary page from disk.
5. After the page is loaded to memory, the process's page table is updated with
the new frame number, and the invalid bit is changed to indicate that this is
now a valid page reference.
6. The instruction that caused the page fault must now be restarted from the beginning.
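A self-contained, user-space sketch of these six steps, with the "disk read" reduced to a print statement; the structures and numbers are illustrative, not a real kernel interface:

    /* Toy simulation of demand paging: pages are brought in on first use. */
    #include <stdio.h>
    #include <stdlib.h>

    #define NPAGES  8
    #define NFRAMES 4

    struct pte { int frame; int valid; };

    struct pte page_table[NPAGES];   /* all valid bits start at 0 (i) */
    int next_free_frame = 0;

    int find_free_frame(void)                      /* step 3 */
    {
        if (next_free_frame == NFRAMES) {
            fprintf(stderr, "no free frame: page replacement needed\n");
            exit(1);
        }
        return next_free_frame++;
    }

    void handle_page_fault(int page)
    {
        if (page < 0 || page >= NPAGES) {          /* steps 1-2: validity check */
            fprintf(stderr, "illegal reference: terminate process\n");
            exit(1);
        }
        int frame = find_free_frame();             /* step 3 */
        printf("disk read: page %d -> frame %d\n", page, frame);  /* step 4 */
        page_table[page].frame = frame;            /* step 5: update table ... */
        page_table[page].valid = 1;                /* ... and the valid bit */
        /* step 6: the faulting instruction would now be restarted */
    }

    void access(int page)
    {
        if (!page_table[page].valid)               /* page-fault trap */
            handle_page_fault(page);
        printf("page %d is in frame %d\n", page, page_table[page].frame);
    }

    int main(void)
    {
        access(2);   /* fault, paged in */
        access(2);   /* hit */
        access(5);   /* fault, paged in */
        return 0;
    }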
Hardware support:
Demand paging requires the same hardware as paging and swapping.
1. Page table: -Has the ability to mark an entry invalid through valid-invalid bit.
2. Secondary memory: -This holds the pages that are not present in main memory.
PAGE REPLACEMENT
The page-replacement policy deals with selecting a page in memory to be replaced by the new page that must be brought in. While a user process is executing, a page fault occurs.
The hardware traps to the operating system, which checks its internal tables to see that this is a page fault and not an illegal memory access.
The operating system determines where the desired page is residing on the disk, but then finds that there are no free frames on the free-frame list.
When all frames are in use and a new page must be brought in to satisfy the page fault, the replacement policy selects a page currently in memory to be replaced.
Victim Page
The page that is swapped out of physical memory is called the victim page.
If no frames are free, two page transfers (one out and one in) are required.
Each page or frame may have a dirty (modify) bit associated with it in the hardware.
The modify bit for a page is set by the hardware whenever any word or byte in the page is written into, indicating that the page has been modified.
When we select a page for replacement, we check its modify bit.
If the bit is set, the page has been modified since it was read in from the disk, and we must write that page to the disk.
If the bit is not set, the page has not been modified since it was read into memory; the copy on the disk is still valid, so we can avoid writing the memory page back to the disk. Read-only pages can never be modified, so they too need not be written back.
FIFO Algorithm:
This is the simplest page replacement algorithm.
A FIFO replacement algorithm associates each page with the time when that page was
brought into memory.
When a page is to be replaced, the oldest one is selected: we replace the page at the head of the queue.
When a page is brought into memory, we insert it at the tail of the queue.
In the following example, the reference string 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1 is given and there are 3 free frames. There are 20 page requests, which result in 15 page faults.
Figure: FIFO page-replacement example.
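A small simulation of FIFO replacement on this reference string confirms the count of 15 faults; the circular index stands in for the FIFO queue:

    /* FIFO page replacement on the 20-request string, 3 frames. */
    #include <stdio.h>

    #define NFRAMES 3

    int main(void)
    {
        int refs[] = {7,0,1,2,0,3,0,4,2,3,0,3,2,1,2,0,1,7,0,1};
        int n = sizeof(refs) / sizeof(refs[0]);
        int frame[NFRAMES] = {-1, -1, -1};   /* -1 marks an empty frame */
        int head = 0;                        /* index of the oldest page */
        int faults = 0;

        for (int i = 0; i < n; i++) {
            int hit = 0;
            for (int j = 0; j < NFRAMES; j++)
                if (frame[j] == refs[i]) hit = 1;
            if (!hit) {
                frame[head] = refs[i];       /* replace the oldest page */
                head = (head + 1) % NFRAMES; /* advance the queue head */
                faults++;
            }
        }
        printf("page faults: %d\n", faults); /* prints 15 */
        return 0;
    }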
Belady’s Anomaly
Bélády's anomaly is the phenomenon in which increasing the number of page frames
results in an increase in the number of page faults for certain memory access
patterns.
This phenomenon is commonly experienced when using the first-in first-out (FIFO) page-replacement algorithm.
For some page-replacement algorithms, the page-fault rate may increase as the number of allocated frames increases. The FIFO replacement algorithm may face this problem.
more frames ⇒ more page faults
Example: Consider the following reference string, with frames initially empty.
The first three references (7, 0, 1) cause page faults and are brought into the empty frames.
The next reference, 2, replaces page 7 because page 7 was brought in first. Since 0 is the next reference and 0 is already in memory, there is no page fault.
The next reference, 3, results in page 0 being replaced, so the next reference to 0 causes a page fault.
This continues until the end of the string. There are 15 faults altogether.
Optimal Algorithm
The optimal page-replacement algorithm resulted from the search for an algorithm free of Belady's anomaly.
The Optimal page replacement algorithm has the lowest page fault rate of all algorithms.
An optimal page replacement algorithm exists and has been called OPT.
Its working is simple: replace the page that will not be used for the longest period of time.
Example: Consider the following reference string.
The first three references cause faults that fill the three empty frames.
The reference to page 2 replaces page 7, because page 7 will not be used until reference 18, whereas page 0 will be used at 5 and page 1 at 14.
With only 9 page faults, optimal replacement is much better than a FIFO, which had
15 faults.
This algorithm is difficult to implement because it requires future knowledge of reference
strings.
LRU Page Replacement
LRU (least-recently-used) replacement replaces the page that has not been used for the longest period of time.
The main problem is how to implement LRU: it requires substantial hardware assistance.
Two implementations are possible:
1. Counters:
We associate with each page-table entry a time-of-use field and add to the CPU a logical clock or counter.
The clock is incremented on every memory reference.
When a reference to a page is made, the contents of the clock register are copied to the time-of-use field in the page-table entry for that page.
In this way, we always have the time of the last reference to each page, and we replace the page with the smallest time value. The times must also be maintained when page tables are changed.
2. Stack:
Another approach to implementing LRU replacement is to keep a stack of page numbers. Whenever a page is referenced, it is removed from the stack and put on the top.
In this way, the top of the stack is always the most recently used page and the bottom is the least recently used page.
Because entries are removed from the middle of the stack, it is best implemented as a doubly linked list with head and tail pointers.
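A sketch of the stack approach; for brevity it uses a small array with move-to-front rather than a doubly linked list, which behaves identically for so few frames. Run on the 20-request string used earlier, it reports the textbook figure of 12 faults:

    /* LRU as a "stack" of page numbers; stack[0] = most recently used. */
    #include <stdio.h>

    #define NFRAMES 3

    int stack[NFRAMES];
    int used = 0;
    int faults = 0;

    void reference(int page)
    {
        int pos = -1;
        for (int i = 0; i < used; i++)
            if (stack[i] == page) pos = i;         /* already resident? */
        if (pos == -1) {
            faults++;
            if (used < NFRAMES) used++;            /* fill a free frame */
            pos = used - 1;                        /* else: bottom = LRU victim */
        }
        for (int i = pos; i > 0; i--)              /* move everything down ... */
            stack[i] = stack[i - 1];
        stack[0] = page;                           /* ... and put page on top */
    }

    int main(void)
    {
        int refs[] = {7,0,1,2,0,3,0,4,2,3,0,3,2,1,2,0,1,7,0,1};
        int n = sizeof(refs) / sizeof(refs[0]);
        for (int i = 0; i < n; i++)
            reference(refs[i]);
        printf("LRU page faults: %d\n", faults);   /* prints 12 */
        return 0;
    }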
Note: Neither optimal replacement nor LRU replacement suffers from Belady's anomaly. These are called stack algorithms.
Additional-Reference-Bits Algorithm
An 8-bit byte (a history of reference bits) is stored for each page in a table in memory.
At regular intervals (say, every 100 milliseconds), a timer interrupt transfers control to
the operating system.
The operating system shifts the reference bit for each page into the high-order bit of its
8-bit byte, shifting the other bits right by 1 bit and discarding the low-order bit.
These 8-bit shift registers contain the history of page use for the last eight time periods.
If the shift register contains 00000000, then the page has not been used for eight time
periods.
A page with a history register value of 11000100 has been used more recently than one
with a value of 01110111.
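One timer tick of this shifting can be written directly; the history values below are illustrative:

    /* One timer tick of the additional-reference-bits algorithm: the current
     * reference bit is shifted into the high-order bit of the 8-bit history. */
    #include <stdio.h>

    unsigned char tick(unsigned char history, int referenced)
    {
        return (history >> 1) | (referenced ? 0x80 : 0x00);
    }

    int main(void)
    {
        unsigned char h = 0x00;          /* 00000000: not used recently */
        h = tick(h, 1);                  /* 10000000 */
        h = tick(h, 1);                  /* 11000000 */
        h = tick(h, 0);                  /* 01100000 */
        printf("history = 0x%02X\n", h); /* pages with larger values were
                                            used more recently */
        return 0;
    }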
Second- chance (clock) page replacement algorithm
The second-chance algorithm is a FIFO replacement algorithm, except that the reference bit is used to give pages a second chance at staying in memory.
When a page must be replaced, the frames are scanned in a FIFO (circular-queue) manner.
If a page is found with its reference bit as ‘0’, then that page is selected as the next
victim.
If the reference bit value is ‘1’, then the page is given a second chance and its
reference bit value is cleared (assigned as ‘0’).
Thus, a page that is given a second chance will not be replaced until all other pages have
been replaced (or given second chances). In addition, if a page is used often, then it sets
its reference bit again. This algorithm is also known as the clock algorithm.
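A sketch of the victim search; the resident pages and reference bits are illustrative values:

    /* Clock (second-chance) victim search: advance a circular hand,
     * clearing reference bits, until a page with ref == 0 is found. */
    #include <stdio.h>

    #define NFRAMES 4

    int page_in[NFRAMES] = {3, 8, 5, 1};   /* resident pages (illustrative) */
    int ref[NFRAMES]     = {1, 0, 1, 1};   /* reference bits */
    int hand = 0;                          /* the clock hand */

    int pick_victim(void)
    {
        for (;;) {
            if (ref[hand] == 0) {          /* no second chance left: victim */
                int victim = hand;
                hand = (hand + 1) % NFRAMES;
                return victim;
            }
            ref[hand] = 0;                 /* give a second chance */
            hand = (hand + 1) % NFRAMES;
        }
    }

    int main(void)
    {
        int v = pick_victim();
        printf("replace page %d in frame %d\n", page_in[v], v);  /* page 8 */
        return 0;
    }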
Allocation Algorithms
After the OS is loaded, the allocation of frames to processes can be done in two ways.
Equal Allocation
If there are m frames available and n processes to share them, each process
gets m / n frames, and the leftovers are kept in a free-frame buffer pool.
Proportional Allocation
Allocate the frames proportionally depending on the size of the process.
If the size of process i is Si, and S is the sum of the sizes of all processes in the system, then the allocation for process Pi is ai = m x Si / S, where m is the number of free frames available in the system.
For example, consider a system with a 1-KB frame size. If a small student process of 10 KB and an interactive database of 127 KB are the only two processes running in a system with 62 free frames, then with proportional allocation we would split the 62 frames between the two processes as follows:
m=62,
S = (10+127) =137
Allocation for process 1 = 62 X 10/137 ~ 4
Allocation for process 2 = 62 X 127/137 ~57
Thus, 4 frames and 57 frames are allocated to the student process and the database, respectively.
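The same computation in code, using integer division to truncate as in the example:

    /* Proportional allocation: a_i = m * S_i / S, with the numbers above. */
    #include <stdio.h>

    int main(void)
    {
        int m = 62;                 /* free frames */
        int size[] = {10, 127};     /* process sizes in KB (1-KB frames) */
        int S = size[0] + size[1];  /* total demand: 137 */

        for (int i = 0; i < 2; i++)
            printf("process %d gets %d frames\n", i + 1, m * size[i] / S);
        /* prints 4 and 57 */
        return 0;
    }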
Variations on proportional allocation could consider priority of process rather than just their size.
THRASHING
If the number of frames allocated to a low-priority process falls below the minimum
number required by the computer architecture, then we suspend the process execution.
A process is thrashing if it is spending more time in paging than executing.
If a process does not have enough frames, it will quickly page-fault. It must then replace some page; but since all its pages are in active use, it replaces a page that will be needed again right away. Consequently, it quickly faults again and again.
The process continues to fault, replacing pages that it must then bring back immediately. This high paging activity is called thrashing: the phenomenon of excessively moving pages back and forth between memory and secondary storage.
Cause of Thrashing
Thrashing results in severe performance problems.
The operating system monitors CPU utilization; if utilization is low, we increase the degree of multiprogramming by introducing new processes to the system.
A global page-replacement algorithm replaces pages with no regard to the process to which they belong. The figure below shows thrashing.
As the degree of multi programming increases, CPU utilization increases, although
more slowly until a maximum is reached.
If the degree of multi programming is increased even further, thrashing sets in and the
CPU utilization drops sharply.
At this point, to increase CPU utilization and stop thrashing, we must decrease the
degree of multiprogramming.
We can limit the effect of thrashing by using a local replacement algorithm. To prevent
thrashing, we must provide a process with as many frames as it needs.
Locality of Reference:
As the process executes, it moves from locality to locality.
A locality is a set of pages that are actively used together.
A program may consist of several different localities, which may overlap.
Locality is caused by loops in code that reference arrays and other data structures by their indices.
The ordered list of page numbers accessed by a program is called reference string.
Locality is of two types:
1. spatial locality
2. temporal locality
The Working-Set Model
The working set of a process is the set of pages referenced in the most recent Δ (working-set window) page references. The collection of the working sets of the active processes is called the balance set. When a process is made active, its working set is loaded.
Some algorithm must be provided for moving processes into and out of the balance set. As a working set changes, a corresponding change is made to the balance set.
The working set prevents thrashing while keeping the degree of multiprogramming as high as possible; thus, it optimizes CPU utilization. The main disadvantage is the overhead of keeping track of the working set.
Page-Fault Frequency
When page- fault rate is too high, the process needs more frames and when it is too low,
the process may have too many frames.
Upper and lower bounds can be established on the page-fault rate. If the actual page-fault rate exceeds the upper limit, we allocate the process another frame (or suspend the process if no free frame is available).
If the page-fault rate falls below the lower limit, remove a frame from the process. Thus,
we can directly measure and control the page-fault rate to prevent thrashing.
For the following page-reference string, calculate the page faults that occur using FIFO with 3 frames and LRU with 4 frames: 5, 4, 2, 1, 4, 3, 5, 4, 3, 2, 1, 5.
Solution:
FIFO with 3 frames: 10 page faults.
Reference:  5  4  2  1  4  3  5  4  3  2  1  5
Frame 1:    5  5  5  1  1  1  1  4  4  4  4  5
Frame 2:       4  4  4  4  3  3  3  3  2  2  2
Frame 3:          2  2  2  2  5  5  5  5  1  1
Fault:      F  F  F  F  .  F  F  F  .  F  F  F

LRU with 4 frames: 9 page faults.
Reference:  5  4  2  1  4  3  5  4  3  2  1  5
Frame 1:    5  5  5  5  5  3  3  3  3  3  3  3
Frame 2:       4  4  4  4  4  4  4  4  4  4  5
Frame 3:          2  2  2  2  5  5  5  5  1  1
Frame 4:             1  1  1  1  1  1  2  2  2
Fault:      F  F  F  F  .  F  F  .  .  F  F  F