Unit-4 (Memory Management)
UNIT 4 AKTU NOTES CSE, BTECH
Uploaded by Sseiepr

Unit-4

Memory Management

Shyam B. Verma (Assistant Professor)
Objectives
To provide a detailed description of various
ways of organizing memory hardware
To discuss various memory-management
techniques, including paging and
segmentation
To provide a detailed description of the Intel
Pentium, which supports both pure
segmentation and segmentation with paging
Background
Memory is an important part of the computer, used to
store data. Its management is critical to the computer
system because the amount of main memory available
is limited, and at any time many processes are
competing for it.
Moreover, to increase performance, several processes
are kept in memory simultaneously, so it is even more
important to manage memory effectively.
Main memory and registers are the only storage the
CPU can access directly. A register can be accessed in
one CPU clock cycle (or less), but a main memory
access can take many cycles, causing a stall.
Cache sits between main memory and CPU registers.
Protection of memory is required to ensure correct
operation.
Role of Memory management
The memory manager keeps track of the status of
memory locations, whether each is free or allocated.
The memory manager permits computers with a small
amount of main memory to execute programs larger
than the available memory. It does this by moving
information back and forth between primary memory
and secondary memory, using the concept of swapping.
The memory manager is responsible for protecting
the memory allocated to each process from being
corrupted by another process. If this is not ensured,
the system may exhibit unpredictable behavior.
Memory managers should enable sharing of memory
space between processes.
Basic Bare Machine
A bare machine is the basic hardware of the computer
system, which can execute programs on the processor
without using an O/S.
In the early days, before operating systems were
developed, instructions were executed directly on the
hardware without any intervening software. The
drawback was that the bare machine accepted
programs and instructions only in machine language.
Because of this, only trained people who could
understand and instruct the computer in machine
language were able to operate such computers.
For this reason, the bare machine came to be regarded
as inefficient and cumbersome after the development
of operating systems.
Resident Monitor
The Resident Monitor is code that runs on a
bare machine. It acts like an operating
system, controlling everything inside the
processor and performing all its functions.
The Resident Monitor is also known as the
Job Sequencer because, like an O/S, it
sequences the jobs and sends them to the
processor for execution.
After the jobs are scheduled, the Resident Monitor
loads the programs one by one into main
memory according to their sequence. The
advantage of a Resident Monitor over an
operating system is that there is no gap or lag
between program executions, so processing is
faster with a Resident Monitor.
Parts of Resident Monitor
Control Language Interpreter
 The job of the Control Language Interpreter is to read and
carry out the instructions line by line to the next level.
Loader
 The Loader is the main part of the Resident Monitor. As
the name suggests, it Loads all the required system and
application programs into the main memory.
Device Driver
 The Device Driver takes care of all the input-output
devices connected to the system. All the
communication that takes place between the user and
the system is handled by the Device Driver. It acts as
an intermediary between the requests made by the
user to the system and the responses the system
produces to fulfill those requests.
Contiguous Memory Management
Schemes
Each program occupies a single contiguous block
of storage locations, i.e., a set of memory locations
with consecutive addresses.
Single contiguous memory management
schemes:
It is the simplest memory management scheme
used in the earliest systems. Main memory is
divided into two contiguous areas or partitions.
The O/S resides permanently in the lower portion, and
the user process is loaded into the other partition.
Advantages:
 Simple to implement. Easy to manage and design.
 Once a process is loaded, it is given the full processor's
time, and no other process will interrupt it.
Disadvantages of single contiguous
memory management schemes:
Wastage of memory space due to unused
memory as the process is unlikely to use all the
available memory space.
The CPU remains idle, waiting for the disk to load
the binary image into the main memory.
A program cannot be executed if it is too large
to fit in the available main memory space.
It does not support multiprogramming, i.e., it
cannot handle multiple programs simultaneously.
Multiple Partitioning
The problem of inefficient CPU use can be overcome
using multiprogramming that allows more than one
program to run concurrently.
To switch between two processes, the operating
systems need to load both processes into the main
memory.
The operating system needs to divide the available
main memory into multiple parts to load multiple
processes into the main memory. Thus multiple
processes can reside in the main memory
simultaneously.
The multiple partitioning schemes can be of
two types:
Fixed Partitioning
Dynamic Partitioning
Fixed-Sized Partitioning
 Memory is divided into several fixed-sized partitions.
Each partition may contain exactly one process.
 So, the degree of multiprogramming is bounded by the
no. of partitions.
 The O/S keeps a table indicating free and occupied
partitions.
 When a process arrives, we search for a partition large
enough to accommodate it.
 To search for a partition we can use three techniques:
 First fit
 Best fit
 Worst fit
 Disadvantages:
 Internal fragmentation: Free space available inside a partition.
How to satisfy a request of size n from a list of free holes?
First-fit: search from the beginning and allocate the
first block that is big enough.
Best-fit: search every free block and allocate the
smallest block that is big enough.
Worst-fit: allocate the largest block; this also requires
searching the entire list.
Example: Allocate the following processes:
P1=357 KB, P2=210 KB, P3=468 KB, P4=491 KB
Partitions: 200 KB | 400 KB | 600 KB | 500 KB | 300 KB | 250 KB
First-fit and best-fit are better than worst-fit in terms of speed and storage
utilization.
First-fit (P4 will not get space):
  200 KB: free | 400 KB: P1 | 600 KB: P2 | 500 KB: P3 | 300 KB: free | 250 KB: free
Best-fit (internal fragmentation = 43+40+32+109 = 224 KB):
  200 KB: free | 400 KB: P1 | 600 KB: P4 | 500 KB: P3 | 300 KB: free | 250 KB: P2
Worst-fit (P3 and P4 will not get space):
  200 KB: free | 400 KB: free | 600 KB: P1 | 500 KB: P2 | 300 KB: free | 250 KB: free
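The three placement strategies above can be sketched in Python; this is an illustrative model (function and variable names are my own, not from the slides) applied to the fixed-partition example:

```python
# Illustrative sketch of first-fit / best-fit / worst-fit selection over
# fixed-sized partitions, where each partition holds at most one process.

def allocate(partitions, processes, strategy):
    """partitions: list of sizes; processes: list of (name, size).
    Returns {process name: partition index} for the processes placed."""
    free = {i: size for i, size in enumerate(partitions)}
    placement = {}
    for name, need in processes:
        candidates = [(i, s) for i, s in free.items() if s >= need]
        if not candidates:
            continue                                 # process must wait
        if strategy == "first":
            i, _ = min(candidates)                   # lowest index wins
        elif strategy == "best":
            i, _ = min(candidates, key=lambda c: c[1])  # smallest fit
        else:  # "worst"
            i, _ = max(candidates, key=lambda c: c[1])  # largest fit
        placement[name] = i
        del free[i]                                  # partition now occupied
    return placement

parts = [200, 400, 600, 500, 300, 250]
procs = [("P1", 357), ("P2", 210), ("P3", 468), ("P4", 491)]
print(allocate(parts, procs, "first"))   # P4 gets no partition
print(allocate(parts, procs, "best"))    # all four processes fit
print(allocate(parts, procs, "worst"))   # P3 and P4 get no partition
```

The printed placements match the three tables above.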
Variable-Sized Partitioning
Initially, all memory is available as one block.
When a process arrives we search for a block
and if found allocate only as much memory as
needed.
When a process terminates, it releases its
memory.
Disadvantage: External fragmentation
Example: Allocate the following processes in
memory: P1=300 KB, P2=25 KB, P3=125 KB, P4=50 KB.
An instance of memory is given below:
  50 KB (Available) | 150 KB (Not available) | 300 KB (Available) | 350 KB (Not available) | 600 KB (Available)
First-fit:
  50 KB: P2-25 KB | 300 KB: P1-300 KB | 600 KB: P3-125 KB and P4-50 KB
Best-fit (P4 will not get space):
  50 KB: P2-25 KB | 300 KB: P1-300 KB | 600 KB: P3-125 KB
Worst-fit:
  50 KB: free | 300 KB: P2-25 KB and P4-50 KB | 600 KB: P1-300 KB and P3-125 KB
Fragmentation
External Fragmentation – total memory space
exists to satisfy a request, but it is not contiguous
Internal Fragmentation – allocated memory
may be slightly larger than requested memory;
this size difference is memory internal to a
partition, but not being used
Depending on the total memory and the average
process size, the external fragmentation problem may
be minor or major. Statistical analysis of first fit
reveals that, given N allocated blocks, another 0.5N
blocks will be lost to fragmentation.
 That is, one-third of memory may be unusable. This
property is called the 50-percent rule.
Fragmentation (Cont.)
Reduce external fragmentation by
compaction
Shuffle memory contents to place all free
memory together in one large block
Compaction is possible only if relocation is
dynamic, and is done at execution time
Another method is to move all processes
towards one end.
Base and Limit Registers
A pair of base and limit registers define the
logical address space
CPU must check every memory access
generated in user mode to be sure it is
between base and limit for that user
Hardware Address Protection
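The hardware check described above can be sketched as follows; the base and limit values here are illustrative, not from the slides:

```python
# Sketch of the base/limit check performed on every user-mode memory
# access: an address is legal only if it lies in [base, base + limit).

def legal_access(addr, base, limit):
    """True if addr falls inside the process's logical address space."""
    return base <= addr < base + limit

base, limit = 300040, 120900                  # illustrative register values
assert legal_access(300040, base, limit)      # first legal address
assert legal_access(420939, base, limit)      # last legal address
assert not legal_access(420940, base, limit)  # one past the end: trap to O/S
```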
Address Binding
 A program resides on disk as a binary executable file. It
must be loaded into memory and placed within a process for
it to be executed. Depending on the memory management
scheme, the process may be moved between disk and
memory during its execution.
 Addresses are represented in different ways at different
stages of a program’s life cycle:
 Source code addresses are usually symbolic
 The compiler binds these symbolic addresses to
relocatable addresses
   e.g. "14 bytes from the beginning of this module"
 The linker or loader binds relocatable addresses to
absolute addresses
   e.g. 74014
 Bindings are done at:
 Compile time
 Load time
 Execution time
Logical vs. Physical Address
Space
 Logical address – address generated by the CPU;
also referred to as virtual address
 Physical address – the address loaded into
the memory-address register (MAR), i.e. the address seen by the memory unit
Logical and physical addresses are the same in
compile-time and load-time address-binding
schemes; logical (virtual) and physical addresses
differ in execution-time address-binding scheme
Logical addresses generated by the CPU are mapped
to corresponding physical addresses before the
memory is actually accessed.
Logical address space is the set of all logical
addresses generated by a program
Physical address space is the set of all physical
addresses generated by a program
Memory-Management Unit
(MMU)
Hardware device that at run time maps virtual
to physical address
Many methods possible
To start, consider simple scheme where the
value in the relocation register is added to
every address generated by a user process at
the time it is sent to memory
 Base register now called relocation register
 MS-DOS on Intel 80x86 used 4 relocation
registers
The user program deals with logical addresses;
it never sees the real physical addresses
 Execution-time binding occurs when reference is
made to location in memory
Hardware Support for Relocation and Limit Registers
Dynamic Linking
Static linking – system libraries and program
code are combined by the loader into the binary
program image
Dynamic linking –linking postponed until execution
time
A small piece of code, called stub, used to locate
the appropriate memory-resident library routine
Stub replaces itself with the address of the routine,
and executes the routine
Operating system checks if routine is in processes’
memory address
 If not in address space, add to address space
Dynamic linking is particularly useful for libraries
Such libraries are also known as shared libraries
Swapping
A process can be swapped temporarily out
of memory to a backing store, and then
brought back into memory for continued
execution. This is done by the medium-term
scheduler (MTS). Swapping of processes is done when:
A high priority process arrives and no
space in the ready queue.
The time quantum of a process expires
and no space in ready queue.
The waiting queue is full.
Backing store – fast disk large enough to
accommodate copies of all memory images
for all users
Roll out, roll in –lower-priority process is
swapped out so higher-priority process can
be loaded and executed
System maintains a ready queue of ready-to-run
processes which have memory images on disk
Whenever the scheduler wants to execute a
process, it calls the dispatcher.
The dispatcher checks if the process is present in
ready queue or not.
If not, and no space is available in memory, the
dispatcher swaps out a process from memory,
swaps in the desired process, and transfers
control to the selected process.
Swap time is the time taken to transfer the data
from memory to the backing store or vice-versa.
For effective utilization of CPU, the execution time
must be greater than the total swap time.
Swapping (Cont.)
Does the swapped out process need to
swap back in to the same physical
addresses?
It depends on address binding methods.
If the binding is done at load-time then
the swapped out process will be kept in
the same memory space previously
allocated.
If the binding is done at execution time,
then a process can be swapped into
different memory locations.
Schematic View of Swapping
Example: process size = 1 MB, transfer rate = 5 MB/sec, latency time = 8 ms.
Swap time = process size / transfer rate + latency time = 200 + 8 = 208 ms
Total swap time = swap-in time + swap-out time = 2 x 208 = 416 ms
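The swap-time arithmetic above can be written out as a quick check (variable names are illustrative; the figures are from the slide):

```python
# Swap-time arithmetic: 1 MB process, 5 MB/s transfer rate, 8 ms latency.

process_mb, rate_mb_per_s, latency_ms = 1, 5, 8

transfer_ms = process_mb / rate_mb_per_s * 1000   # 200 ms to move 1 MB
swap_ms = transfer_ms + latency_ms                # one swap: 208 ms
total_ms = 2 * swap_ms                            # swap-out + swap-in: 416 ms

print(swap_ms, total_ms)   # 208.0 416.0
```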
User’s View of a Program
 A program is a collection of
segments. A segment is a
logical unit such as:
• main program
• procedure
• function
• method
• object
• local variables, global variables
• common block
• stack
• symbol table
• arrays
Logical View of Segmentation

[Figure: segments 1 to 4 of the user space map to noncontiguous regions of physical memory]
Segmentation
Segmentation divides a process into smaller
subparts known as segments or modules. The divided
segments need not be placed in contiguous memory.
Segmentation supports the user's view of memory:
users prefer to view memory as a collection of
variable-sized segments.
Since each segment is allocated exactly the space it
needs, internal fragmentation does not take place. The
length of each segment is decided by the purpose of
the segment in the user program.
The user’s view is mapped onto physical memory. A
logical address space is collection of segments and
each segment has a number and a length.
The address specifies both the segment number and
the offset within a segment.
So, a logical address consists of <Segment-
number, Offset>
Segmentation Hardware
Segmentation Architecture
Logical address consists of a two tuple:
<segment-number, offset>
Segment table – stores the base address of a
segment and the limit.
 base – contains the starting physical address
where the segments reside in memory
 limit – specifies the length of the segment

Segment-table base register (STBR): points
to the segment table's location in memory
Segment-table length register (STLR):
indicates the number of segments used by a
program; a segment number s is legal if s < STLR
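The lookup described above can be sketched as follows; the segment-table contents are illustrative, not from the slides:

```python
# Sketch of segment-table translation: a logical address <s, d> is legal
# only if s < STLR and d < limit[s]; it then maps to base[s] + d.

segment_table = [      # (base, limit), illustrative values
    (1400, 1000),      # segment 0
    (6300, 400),       # segment 1
    (4300, 1100),      # segment 2
]

def translate(s, d):
    if s >= len(segment_table):       # s >= STLR: trap
        raise MemoryError("invalid segment number")
    base, limit = segment_table[s]
    if d >= limit:                    # offset beyond segment length: trap
        raise MemoryError("offset out of range")
    return base + d

print(translate(2, 53))   # 4300 + 53 = 4353
```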
Segmentation Architecture
(Cont.)
Protection & Sharing
 With each entry in segment table associate:
 validation bit = 0  illegal segment
 read/write/execute privileges

Two different processes may share the same
segment. Protection bits are associated with
segments; code sharing occurs at the segment level.
For all processes, a shared segment will have the
same segment number and only one copy in
physical memory.
Segments local to a process are not sharable.
Since segments vary in length, memory
allocation is a dynamic storage-allocation
problem
Disadvantages of
Segmentation
Since segments are of variable length, segmentation
may cause external fragmentation.
In this case a process may wait until a large block of
memory becomes available or until compaction creates
a larger block.
Because segmentation is dynamic by nature, we can
compact memory whenever we need to.
The problem of fragmentation depends on the
average segment size:
 If the average segment size is small, external
fragmentation will also be small.
 Solution?
 Fixed-sized small segment: Paging
Paging
 Paging is a memory-management scheme that permits the
physical address space of a process to be non-contiguous.
 Memory is broken into fixed-sized blocks called frames.
 Size is power of 2, between 512 bytes and 16 Mbytes
 Divide logical memory into blocks of same size called pages
 Process is allocated physical memory whenever the frames
are available
 Avoids external fragmentation
 Avoids problem of varying sized memory chunks
 Keep track of all free frames
 To run a program of size N pages, need to find N free frames
and load program
 Set up a page table to translate logical to physical
addresses
 Still have Internal fragmentation
Paging Hardware
Address Translation Scheme
Address generated by CPU is divided into:
Page number (p) – used as an index into a
page table which contains base address of
each page in physical memory
Page offset (d) – combined with base address
to define the physical memory address that is
sent to the memory unit
Page size is equal to frame size.
For a logical address space of 2^m and a page size of
2^n addressing units (bytes or words):
  no. of bits for the page number p = m - n (higher-order bits)
  no. of bits for the page offset d = n (lower-order bits)
  | p (m - n bits) | d (n bits) |
Example
 Page size = 4 bytes, physical memory = 32 bytes,
logical address space = 2^4 bytes
 The CPU will generate a logical address of 4 bits: 2
bits to locate a page and 2 bits to locate a word
within a page.
 No. of frames = 32/4 = 8 (3 bits are required to
represent a frame)
Paging Example

n = 2 and m = 4: 32-byte memory and 4-byte pages
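For the 4-byte-page example above, the page-number/offset split can be sketched as follows; the page-table contents are illustrative, not from the slides:

```python
# Sketch of paging translation for m = 4 address bits and n = 2 offset
# bits (4-byte pages, 16-byte logical space, 32-byte physical memory).

OFFSET_BITS = 2
page_table = {0: 5, 1: 6, 2: 1, 3: 2}       # page -> frame, illustrative

def translate(logical):
    p = logical >> OFFSET_BITS              # high m-n bits: page number
    d = logical & ((1 << OFFSET_BITS) - 1)  # low n bits: page offset
    return (page_table[p] << OFFSET_BITS) | d

print(translate(13))   # page 3, offset 1 -> frame 2 -> physical address 9
```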
Paging (Cont.)
 Calculating internal fragmentation
 Page size = 2,048 bytes
 Process size = 72,766 bytes
 35 pages + 1,086 bytes
 Internal fragmentation of 2,048 - 1,086 = 962 bytes
 Worst case fragmentation = 1 frame – 1 byte
 On average fragmentation = 1/2 frame size
 So small frame sizes desirable, but involves overhead in
page table entry.
 But each page table entry takes memory to track
 Page sizes growing over time
 Solaris supports two page sizes – 8 KB and 4 MB

 Process view and physical memory now very different


 By implementation process can only access its own
memory
Free Frames

Before allocation After allocation


Implementation of Page Table
 Each O/S has its own method for storing page tables. Most
allocate a page table for each process. A pointer to the
page table is stored in the PCB of the process. Two
registers are used:
 Page table is kept in main memory and a Page-table
base register (PTBR) points to the page table. Changing
page table during context switch requires only changing
PTBR.
 Page-table length register (PTLR) indicates size of the
page table
 In this scheme every data/instruction access requires two memory
accesses. One for the page table and one for the data / instruction.
 The two memory access problem can be solved by the
use of a special fast-lookup hardware cache called
associative memory or translation look-aside buffers
(TLBs). When an item is searched all keys are compared
simultaneously. Hardware is expensive but fast.
Implementation of Page Table
(Cont.)
The TLB contains entries of pages in main-memory.
TLB store the <page-no. , frame no. > entry.
When a logical address is generated, its page no. is
searched in TLB. In case of TLB hit, corresponding
frame no. loaded and in case of a miss, search the
page no. in the page table, make an entry in TLB
also.
If the TLB is full the O/S will replace an entry
(depending on the policy) with new one.
Some TLBs store address-space identifiers
(ASIDs) in each TLB entry – uniquely identifies each
process to provide address-space protection for that
process
 Otherwise need to flush at every context switch
TLBs typically small (64 to 1,024 entries)
Paging Hardware With TLB
Associative Memory

Associative memory – parallel search
  | Page # | Frame # |
Address translation (p, d):
 If p is in an associative register, get the frame # out
 Otherwise get the frame # from the page table in
memory
Effective Access Time
 Associative lookup = ε time units
 Can be < 10% of memory access time
 Hit ratio = α
 Hit ratio – percentage of times that a page number is
found in the TLB; related to the number of associative
registers
 Effective Access Time (EAT), with MAT = memory access time:
EAT = (MAT + ε)α + (2*MAT + ε)(1 – α)
 Consider α = 80%, ε = 20 ns for TLB search, MAT = 100 ns:
 EAT = 0.80 x (100 + 20) + 0.20 x (2*100 + 20) = 140 ns
 Consider a more realistic hit ratio α = 98%, ε = 20 ns, MAT = 100 ns:
 EAT = 0.98 x (100 + 20) + 0.02 x (200 + 20) = 122 ns
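The two EAT computations above can be written out directly (`eat` is an illustrative helper name):

```python
# EAT = (MAT + eps) * alpha + (2*MAT + eps) * (1 - alpha)
# alpha = TLB hit ratio, eps = TLB search time, MAT = memory access time.

def eat(alpha, eps, mat):
    hit = mat + eps           # TLB hit: one memory access
    miss = 2 * mat + eps      # TLB miss: page-table access + data access
    return alpha * hit + (1 - alpha) * miss

print(eat(0.80, 20, 100))   # 140.0 ns
print(eat(0.98, 20, 100))   # 122.0 ns
```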
Memory Protection
Memory protection implemented by associating
protection bit with each frame to indicate if
read-only or read-write access is allowed
 Can also add more bits to indicate page execute-
only, and so on
Valid-invalid bit attached to each entry in the
page table:
 “valid” indicates that the associated page is in
the process’ logical address space, and is thus a
legal page
 “invalid” indicates that the page is not in the
process’ logical address space
 Or use page-table length register (PTLR)
Any violations result in a trap to the kernel
Valid (v) or Invalid (i) Bit In A Page
Table
Shared Pages
Shared code
One copy of read-only (reentrant) code
shared among processes (i.e., text editors,
compilers, window systems)
Similar to multiple threads sharing the same
process space
Also useful for interprocess communication if
sharing of read-write pages is allowed
Private code and data
Each process keeps a separate copy of the
code and data
The pages for the private code and data can
appear anywhere in the logical address space
Shared Pages Example
Structure of the Page Table
Structuring of the page table is important for
efficient use of paging.
 Consider a 32-bit logical address space, as on
modern computers
 Page size of 4 KB (2^12)
 No. of pages = 2^20
 Page table would have 1 million entries (2^32 / 2^12)
 With 4 MB of physical memory, no. of frames = 2^10
 If each page-table entry is 4 bytes, the memory for
the page table alone is 4 MB. This page table is itself
divided into pages and stored in memory. There are
different ways to store the page table in memory:
Hierarchical Paging
Hashed Page Tables
Inverted Page Tables
Hierarchical Page Tables

Break up the logical address space into
multiple page tables
A simple technique is a two-level page table
In this method the page table itself is also
paged. As per the previous example, the size of the
page table = 4 MB
No. of pages required to store the page
table = 4 MB / 4 KB = 2^10
The CPU's 32-bit logical address, with its 20-bit page
number p and 12-bit offset d, is divided as:
  | p1 (10 bits) | p2 (10 bits) | d (12 bits) |
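The 10/10/12 split above can be sketched with bit operations (names are illustrative):

```python
# Split a 32-bit logical address into p1 (outer index, 10 bits),
# p2 (inner index, 10 bits) and d (offset, 12 bits).

def split(addr):
    d = addr & 0xFFF            # low 12 bits: offset within the page
    p2 = (addr >> 12) & 0x3FF   # next 10 bits: index into inner page table
    p1 = addr >> 22             # top 10 bits: index into outer page table
    return p1, p2, d

print(split(0xFFFFFFFF))   # (1023, 1023, 4095)
```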
Two-Level Page-Table Scheme
Two-Level Paging Example
 A logical address (on a 32-bit machine with a 1K page size)
is divided into:
 a page number consisting of 22 bits
 a page offset consisting of 10 bits
 Since the page table is paged, the page number is
further divided into:
 a 12-bit outer page number p1
 a 10-bit inner page number p2
 Thus, a logical address is as follows:
  | p1 (12 bits) | p2 (10 bits) | d (10 bits) |
 where p1 is an index into the outer page table, and p2 is
the displacement within the page of the inner page table
 Known as a forward-mapped page table
Address-Translation Scheme
64-bit Logical Address Space
Even a two-level paging scheme is not sufficient
If page size is 4 KB (2^12)
 Then the page table has 2^52 entries
 With a two-level scheme, the inner page tables could be 2^10 4-
byte entries
 The outer page table has 2^42 entries or 2^44 bytes
 One solution is to add a 2nd outer page table
 But in the following example the 2nd outer page table is
still 2^34 bytes in size
 And possibly 4 memory accesses to get to one physical memory
location
Three-level Paging Scheme
Hashed Page Tables
If the system uses address spaces > 32 bits
The virtual page number is hashed into a
page table
This page table contains a chain of elements
hashing to the same location
Each element contains:
the virtual page number
the value of the mapped page frame
a pointer to the next element
Virtual page numbers are compared in this
chain searching for a match
If a match is found, the corresponding physical
frame is extracted
Hashed Page Table
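The chained lookup described above can be sketched as follows, using a Python list per bucket in place of the linked chain (names and sizes are illustrative):

```python
# Sketch of a hashed page table: each bucket chains
# (virtual page number, frame) entries that hash to the same slot.

NUM_BUCKETS = 16
table = [[] for _ in range(NUM_BUCKETS)]   # each entry: (vpn, frame)

def insert(vpn, frame):
    table[hash(vpn) % NUM_BUCKETS].append((vpn, frame))

def lookup(vpn):
    for entry_vpn, frame in table[hash(vpn) % NUM_BUCKETS]:
        if entry_vpn == vpn:          # walk the chain looking for a match
            return frame
    raise KeyError("page fault")      # no match: page not in memory

insert(0x12345, 7)
print(lookup(0x12345))   # 7
```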
Inverted Page Table
Rather than each process having a page table
that tracks all of its possible logical pages, the
system keeps one table with exactly one entry
per physical frame.
The logical address consists of <process-id,
page-no., offset>
When a search matches entry i, frame i holds the
page, so the physical address is <i, offset>
It decreases memory needed to store each page
table, but increases time needed to search the
table when a page reference occurs
Use hash table to limit the search to one — or at
most a few — page-table entries
 TLB can accelerate access
Inverted Page Table Architecture
Chapter 9: Virtual Memory
Background
Demand Paging
Copy-on-Write
Page Replacement
Allocation of Frames
Thrashing
Memory-Mapped Files
Allocating Kernel Memory
Other Considerations
Operating-System Examples
Objectives
To describe the benefits of a virtual memory
system
To explain the concepts of demand paging,
page-replacement algorithms, and allocation
of page frames
To discuss the principle of the working-set
model
To examine the relationship between shared
memory and memory-mapped files
To explore how kernel memory is managed
Background
Code needs to be in memory to execute, but
entire program rarely used
 Error code, unusual routines, large data structures
Entire program code not needed at same time
Consider ability to execute partially-loaded
program
 Program no longer constrained by limits of
physical memory
 Each program takes less memory while running ->
more programs run at the same time
 Increased CPU utilization and throughput with no increase
in response time or turnaround time
 Less I/O needed to load or swap programs into
memory -> each user program runs faster
Background (Cont.)
Virtual memory is a technique that allows the
execution of processes that may not be
completely in memory.
Virtual memory – separation of user logical
memory from physical memory
 Only part of the program needs to be in memory
for execution
 Logical address space can therefore be much
larger than physical address space
 Allows address spaces to be shared by several
processes
 Allows for more efficient process creation
 More programs running concurrently
 Less I/O needed to load or swap processes
Background (Cont.)
Virtual address space – logical view of
how process is stored in memory
Usually start at address 0, contiguous
addresses until end of space
Meanwhile, physical memory organized in
page frames
MMU must map logical address to physical
address
Virtual memory can be implemented via:
Demand paging
Demand segmentation
Virtual Memory That is Larger Than Physical
Memory
Demand Paging
Similar to paging system with swapping
process.
We could bring entire process into memory
at load time Or bring a page into
memory only when it is needed.
Lazy swapper – never swaps a page
into memory unless that page will be
needed by the process.
A swapper that deals with pages is called
a pager
Less I/O needed, no unnecessary I/O
Page is needed → reference it
invalid reference → abort
not-in-memory → bring to memory
Basic Concepts
 When a process is to be swapped in, the pager
guesses which pages will be used before the
process is swapped out again. The pager brings
in only those pages into memory.
 How to determine those set of pages?
 Need new MMU functionality to implement demand
paging
Valid-Invalid Bit
With each page table entry a valid–invalid bit is
associated
(v → in-memory, i.e. memory resident; i → not-
in-memory)
Initially valid–invalid bit is set to i on all entries
Example of a page table snapshot:

During MMU address translation, if the valid–
invalid bit in the page table entry is i → page fault
Page Table When Some Pages Are Not in Main Memory
Page Fault
If a reference is made to a page and it is the first
reference to that page, the hardware traps to the
operating system: a page fault occurs
1.The procedure for handling page fault will
look at the page reference made. If it is:
 an invalid reference  abort the process
 not in memory
a. Find a free frame in physical memory
b. Swap page into frame via scheduled disk
operation
c. Reset tables to indicate page now in memory,
Set validation bit = v
2.Restart the instruction that caused the page
fault
Steps in Handling a Page Fault
Aspects of Demand Paging
 In extreme case – start process with no pages in
memory
 The O/S will bring the first page in memory and start
executing.
 Page faults may occur until every page needed by the process
is in memory.
 This scheme is called Pure demand paging
 Locality of reference
 To reduce the page faults, execute process only when
required pages are loaded into memory.
 To find required pages needed, a locality model used.
 It states that when a process executes it moves from locality
to locality.
 A locality is a set of pages that are actively used together.
 Hardware support needed for demand paging
 Page table with valid / invalid bit
 Secondary memory (swap device with swap space)
 Instruction restart
Instruction Restart
It is very important to restart the instruction
after the page fault.
The page fault can occur at any stage.
If it occur at instruction fetch then, restart the
instruction fetch after the page fault.
 If it occur at operand fetch then, fetch and
decode the instruction again and then fetch
the operand again.
Paging
Stages in Demand Paging (worst case)
1. Trap to the operating system
2. Save the user registers and process state
3. Determine that the interrupt was a page fault
4. Check that the page reference was legal and determine the location of
the page on the disk
5. Issue a read from the disk to a free frame:
1. Wait in a queue for this device until the read request is serviced
2. Wait for the device seek and/or latency time
3. Begin the transfer of the page to a free frame
6. While waiting, allocate the CPU to some other user
7. Receive an interrupt from the disk I/O subsystem (I/O completed)
8. Save the registers and process state for the other user
9. Determine that the interrupt was from the disk
10.Correct the page table and other tables to show page is now in memory
11.Wait for the CPU to be allocated to this process again
12.Restore the user registers, process state, and new page table, and then
resume the interrupted instruction
Paging
Three major activities
Service the interrupt – careful coding means just
several hundred instructions needed
Read the page – lots of time
Restart the process – again just a small amount of
time
Page fault rate p: 0 ≤ p ≤ 1
if p = 0, no page faults
if p = 1, every reference is a fault
Effective Access Time (EAT)
EAT = (1 – p) x memory access
+ p (page fault overhead
+ swap page out
+ swap page in )
Demand Paging Example
 Memory access time = 200 nanoseconds
 Average page-fault service time = 8 milliseconds
 EAT = (1 – p) * 200 + p (8 milliseconds)
= (1 – p) * 200 + p * 8,000,000
= 200 + p * 7,999,800
 If one access out of 1,000 causes a page fault, then
EAT = 8.2 microseconds.
This is a slowdown by a factor of 40!!
 If want performance degradation < 10 percent
 220 > 200 + 7,999,800 x p
20 > 7,999,800 x p
 p < .0000025
 < one page fault in every 400,000 memory accesses
What Happens if There is no Free
Frame?

If all the space of physical memory is used up by
process pages, the O/S has several options:
 It could terminate the user process.
 The O/S could swap out a process by freeing all its
frames. It will reduce the degree of
multiprogramming.
 Replace an old page by a new page. This process is
called Page Replacement.
Page replacement – find some page in memory, but
not really in use, page it out.
 Performance – want an algorithm which will result in
minimum number of page faults
Same page may be brought into memory several
times
Page Replacement
Prevent over-allocation of memory by
modifying page-fault service routine to include
page replacement
Use modify (dirty) bit to reduce overhead of
page transfers – only modified pages are
written to disk
Page replacement completes separation
between logical memory and physical memory
– large virtual memory can be provided on a
smaller physical memory
Need For Page Replacement
Basic Page Replacement
1. Find the location of the desired page on disk
2. Find a free frame:
- If there is a free frame, use it
- If there is no free frame, use a page
replacement algorithm to select a victim frame
- Write victim frame to disk if dirty
3. Bring the desired page into the (newly) free
frame; update the page and frame tables
4. Continue the process by restarting the instruction
that caused the trap
 Note: now potentially 2 page transfers per page fault (victim page out, desired page in) – increasing EAT
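The four steps above can be sketched as a fault handler; the disk routines, table layout, and victim-selection policy here are illustrative stand-ins, not a real kernel API:

```python
# Hypothetical sketch of the basic page-replacement steps; names are made up.
def write_to_disk(page):        # placeholder for disk-out I/O
    pass

def read_from_disk(page, frame):  # placeholder for disk-in I/O
    pass

def handle_page_fault(page, frames, free_frames, table, dirty, select_victim):
    if free_frames:                      # step 2a: a free frame exists - use it
        frame = free_frames.pop()
    else:                                # step 2b: choose a victim frame
        victim = select_victim(frames)
        if dirty.get(victim):            # step 2c: write out only if modified
            write_to_disk(victim)        # this second transfer increases EAT
        frame = frames.pop(victim)
        del table[victim]
    read_from_disk(page, frame)          # step 3: bring the desired page in
    frames[page] = frame
    table[page] = frame                  # update page and frame tables
    dirty[page] = False
    # step 4: the faulting instruction is restarted by the kernel/hardware

# Demo: two free frames, then a third fault forces eviction of the oldest page.
frames, free, table, dirty = {}, [0, 1], {}, {}
oldest_first = lambda resident: next(iter(resident))  # longest-resident victim
for p in ['A', 'B', 'C']:
    handle_page_fault(p, frames, free, table, dirty, oldest_first)
print(frames)  # {'B': 0, 'C': 1}
```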
Page and Frame Replacement
Algorithms
 Frame-allocation algorithm determines
 How many frames to be given to each process
 Which frames to replace
 Page-replacement algorithm
 Want lowest page-fault rate on both first access and re-access
 Evaluate algorithm by running it on a particular string
of memory references (reference string) and
computing the number of page faults on that string
 String is just page numbers, not full addresses
 Repeated access to the same page does not cause a page fault
 Results depend on number of frames available
 In all our examples, the reference string of
referenced page numbers is
7,0,1,2,0,3,0,4,2,3,0,3,0,3,2,1,2,0,1,7,0,1
Graph of Page Faults Versus The Number of
Frames
First-In-First-Out (FIFO) Algorithm
Reference string:
7,0,1,2,0,3,0,4,2,3,0,3,0,3,2,1,2,0,1,7,0,1
3 frames (3 pages can be in memory at a time per
process)
 15 page faults
 Can vary by reference string: consider
1,2,3,4,1,2,5,1,2,3,4,5
Adding more frames can cause more page faults!
 Belady’s Anomaly: the page-fault rate may increase as the number of allocated frames increases.
How to track ages of pages?
Just use a FIFO queue
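A FIFO queue is indeed all the bookkeeping needed; a short simulation reproduces both fault counts quoted above:

```python
from collections import deque

def fifo_faults(refs, n_frames):
    """Count page faults under FIFO replacement."""
    frames, order, faults = set(), deque(), 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == n_frames:
                frames.discard(order.popleft())  # evict the oldest resident page
            frames.add(page)
            order.append(page)
    return faults

# The lecture's reference string, 3 frames:
refs = [7,0,1,2,0,3,0,4,2,3,0,3,0,3,2,1,2,0,1,7,0,1]
print(fifo_faults(refs, 3))    # 15 page faults

# Belady's Anomaly on the second string: adding a frame INCREASES faults.
belady = [1,2,3,4,1,2,5,1,2,3,4,5]
print(fifo_faults(belady, 3))  # 9 faults
print(fifo_faults(belady, 4))  # 10 faults
```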
FIFO Illustrating Belady’s Anomaly
Optimal (OPT) Algorithm
 Replace page that will not be used for longest
period of time
 This algorithm guarantees the lowest possible page-fault rate for a fixed number of frames.
 Never suffers from Belady’s anomaly.
 There are 9 page-faults and this is optimal for the example
 How do you know which page will not be required in the future?
 Can’t read the future
 Used for measuring and comparing with other
algorithms to show how well your algorithm
performs
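Since OPT needs the full future reference string, it is easy to simulate offline as a benchmark; this sketch evicts the resident page whose next use is farthest away:

```python
# Optimal (OPT) replacement: evict the page whose next use is farthest in the
# future. Only simulable offline, since it requires knowing the future.
def opt_faults(refs, n_frames):
    frames, faults = set(), 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) == n_frames:
            future = refs[i + 1:]
            # A page never used again counts as "infinitely far" in the future.
            victim = max(frames, key=lambda p: future.index(p)
                         if p in future else len(future) + 1)
            frames.discard(victim)
        frames.add(page)
    return faults

refs = [7,0,1,2,0,3,0,4,2,3,0,3,0,3,2,1,2,0,1,7,0,1]
print(opt_faults(refs, 3))  # 9 page faults - the minimum for this string
```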
Least Recently Used (LRU)
Algorithm
 Use past knowledge rather than future
 Replace the page that has not been used for the longest period of time
 Associate time of last use with each page
 12 faults – better than FIFO but worse than OPT
 Generally good algorithm and frequently used
 But how to implement?
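Before looking at the hardware options, the policy itself can be simulated by tracking each page's time of last use and evicting the minimum:

```python
# LRU replacement: evict the page whose last use is furthest in the past.
def lru_faults(refs, n_frames):
    last_used, faults = {}, 0            # page -> time of last reference
    for t, page in enumerate(refs):
        if page not in last_used:
            faults += 1
            if len(last_used) == n_frames:
                victim = min(last_used, key=last_used.get)  # least recently used
                del last_used[victim]
        last_used[page] = t
    return faults

refs = [7,0,1,2,0,3,0,4,2,3,0,3,0,3,2,1,2,0,1,7,0,1]
print(lru_faults(refs, 3))  # 12 faults, between FIFO (15) and OPT (9)
```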
LRU Algorithm (Cont.)
 Counter implementation
 Every page entry has a counter; every time page is
referenced through this entry, copy the clock into the
counter
 When a page needs to be replaced, look at the counters
to find the page with the smallest (oldest) value
 Search through table needed

 Stack implementation
 Keep a stack of page numbers in a double link form:
 Page referenced:
 move it to the top
 requires 6 pointers to be changed
 But each update more expensive
 No search for replacement
 LRU and OPT are cases of stack algorithms
that don’t have Belady’s Anomaly
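The doubly linked "stack" above can be sketched with Python's OrderedDict, which keeps insertion order: referencing a page moves it to the top, and the bottom entry is always the LRU victim, so no search is needed at replacement time.

```python
from collections import OrderedDict

# Sketch of the stack implementation of LRU; most recent page at the end.
class LRUStack:
    def __init__(self, n_frames):
        self.n_frames = n_frames
        self.stack = OrderedDict()

    def reference(self, page):
        """Process one reference; return True if it caused a page fault."""
        if page in self.stack:
            self.stack.move_to_end(page)    # move to top: no search needed
            return False
        if len(self.stack) == self.n_frames:
            self.stack.popitem(last=False)  # evict from the bottom (LRU page)
        self.stack[page] = True
        return True

mem = LRUStack(3)
refs = [7,0,1,2,0,3,0,4,2,3,0,3,0,3,2,1,2,0,1,7,0,1]
faults = sum(mem.reference(p) for p in refs)
print(faults)  # 12 faults
```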
Use Of A Stack to Record Most Recent Page References
Thrashing
If a process does not have “enough” pages,
the page-fault rate is very high
Page fault to get page
Replace existing frame
But quickly need replaced frame back
This leads to:
 Low CPU utilization
 Operating system thinking that it needs to increase
the degree of multiprogramming
 Another process added to the system
Thrashing ≡ a process is busy swapping pages in and out
Thrashing (Cont.)
Working-Set Model
 It is based on assumption of locality.
 Δ ≡ working-set window: examine the most recent Δ page references and put them into the working set.
 If a page is in active use, it will be in the working set. If it is no longer in use, it is dropped from the working set Δ time units after its last reference. Thus, the working set is an approximation of the program’s locality.
WSSi (working-set size of process Pi) =
total number of pages referenced in the most recent Δ (varies in time)
if Δ too small, it will not encompass the entire locality
if Δ too large, it will encompass several localities
if Δ = ∞, it will encompass the entire program
Total demand for frames: D = Σ WSSi (an approximation of total locality)
if D > m (available frames) ⇒ Thrashing
Policy if D > m, then suspend or swap out one of
the processes
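The window mechanics can be sketched directly: the working set at time t is just the set of distinct pages among the last Δ references. The reference string here is illustrative, not from the slides.

```python
# Working set at time t: distinct pages among the most recent DELTA references.
def working_set(refs, t, delta):
    return set(refs[max(0, t - delta + 1):t + 1])

refs = [1, 2, 1, 5, 7, 7, 7, 7, 5, 1]   # made-up reference string
print(working_set(refs, 4, 5))   # locality at t=4: {1, 2, 5, 7}
print(working_set(refs, 9, 5))   # locality has shifted by t=9: {1, 5, 7}

# Total demand D = sum of WSS_i over all processes; D > m signals thrashing.
wss = [len(working_set(refs, 9, 5)), 4, 3]   # WSS of three processes
print(sum(wss))                  # compare against m, the frames available
```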
Keeping Track of the Working Set
Approximate with interval timer + a reference
bit
Example: Δ = 10,000
Timer interrupts after every 5000 time units
Keep in memory 2 bits for each page
Whenever the timer interrupts, copy each page’s reference bit into its in-memory bits and then clear the reference bit to 0
If one of the in-memory bits = 1 ⇒ page is in the working set
Why is this not completely accurate? (We cannot tell exactly when, within the 5,000-unit interval, a reference occurred.)
Improvement = 10 bits and interrupt every
1000 time units
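The interval-timer approximation can be sketched as follows; the two in-memory bits per page are a shift register that the timer interrupt fills from the hardware reference bit:

```python
# Sketch of the reference-bit approximation of the working set: each page has
# a hardware reference bit plus 2 in-memory history bits.
class Page:
    def __init__(self):
        self.ref_bit = 0       # set by "hardware" on every reference
        self.history = [0, 0]  # in-memory bits, newest first

def timer_interrupt(pages):
    for p in pages:
        p.history = [p.ref_bit] + p.history[:-1]  # copy the reference bit...
        p.ref_bit = 0                             # ...then clear it

def in_working_set(p):
    return p.ref_bit == 1 or any(p.history)  # any bit set => recently used

pages = [Page(), Page()]
pages[0].ref_bit = 1       # only page 0 was referenced this interval
timer_interrupt(pages)
print(in_working_set(pages[0]), in_working_set(pages[1]))  # True False
```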
Page-Fault Frequency
More direct approach than WSS
Establish “acceptable” page-fault frequency
(PFF) rate and use local replacement policy
If actual rate too low, process loses frame
If actual rate too high, process gains frame
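The PFF control loop amounts to a simple band check per process; the threshold values below are made up for illustration:

```python
# Page-fault-frequency control sketch: keep each process's fault rate inside
# an "acceptable" band by adjusting its frame allocation. Thresholds invented.
LOWER, UPPER = 0.0001, 0.01   # acceptable faults-per-reference band

def adjust_frames(frames, fault_rate):
    if fault_rate > UPPER:
        return frames + 1          # faulting too often: give it a frame
    if fault_rate < LOWER:
        return max(1, frames - 1)  # faulting rarely: take a frame away
    return frames                  # within the band: leave it alone

print(adjust_frames(10, 0.05))     # 11 - gains a frame
print(adjust_frames(10, 0.00001))  # 9  - loses a frame
print(adjust_frames(10, 0.001))    # 10 - unchanged
```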
Working Sets and Page-Fault Rates
 Direct relationship between working set of a process
and its page-fault rate
 Working set changes over time
 Peaks and valleys over time
End of Chapter 9