Memory Management
Vijay Kumar
Memory
1. Registers (all computations take place here)
2. Cache (frequently accessed data may reside here)
3. Primary memory
4. Mass memory (all archiving is done here)
Physical characteristics of primary memory:
1. A linear list of words (size is defined by the hardware).
2. Each word is associated with a unique (absolute) address.
Physical memory
Random access is possible.
Reading time: fixed (1 cycle: 80 ns).
Writing time: fixed (1.5 cycles: 120 ns).
Address type: physical (absolute address).
Memory size can be expanded dynamically.
Data movement between levels: disk <-> cache, primary memory <-> disk, registers <-> cache, registers <-> primary memory.

[Figure: Levels of real memory - registers, cache, primary memory, mass memory]
Machine with no resident monitor
1. The entire memory belongs to one program.
2. The operator loads the program and provides commands to execute it via some keys.
3. At the end of the execution the operator uses another set of keys to get the output.
4. Outputs are distributed manually.
5. Such a system has limitations and cannot satisfy today's processing requirements.
Machine with resident monitor: The operator's functions are stored in the system as a resident monitor. To do this, the entire memory is divided into a monitor part and a user part. The monitor is loaded permanently into its portion of the memory; a program is loaded into the primary memory for execution.

Requirement: The monitor must be protected from the user program.

Solution: The only way to do this is to define precisely the upper boundary (end address) of the monitor portion of memory. A fixed memory location (the fence word) is used to hold this address. Every memory access is checked against the contents of the fence word, and access is granted only if the desired address is larger.

Limitations
a. The monitor cannot grow and shrink.
b. The fence word must be chosen carefully.
c. A user program must be loaded at a fixed location.
d. If program size > memory, the program cannot be executed.
e. A program must be recompiled whenever any change is made.
Improvement: Use a fence register instead of a fence word. The contents of the fence register can be changed at any time. System initialization: load the required value into the fence register.
Memory management algorithm: the CPU generates an address (say paddr):

if (paddr >= fence) and (paddr < memory limit) then
    access the location
else begin
    generate a trap for memory violation;
    terminate the program;
    start another program
end;
Limitations
1. A program must be loaded at its compiled addresses.
2. If the monitor grows or shrinks, the program must be recompiled before execution.
Relocation: the facility to relocate a program anywhere in the memory.

Requirements of relocation
1. The addresses assigned to the program by the compiler should be independent of physical addresses: the compiler should compile every program address starting from 0, i.e., it should generate logical addresses.
2. Program addresses should be relative to a fixed location.
3. The compiler should not load the program in the memory.
Implementation
1. Define a hardware register (the relocation/base register). This achieves aim 2 above.
2. Convert the logical address generated by the CPU, using the relocation register, into the physical address of the required location. The CPU then accesses the contents of this location and continues with the program.
[Figure: Dynamic relocation using a relocation register - the CPU issues logical address 45, the contents of the relocation register are added, and the resulting physical address (1445 in the figure) is used to access memory]
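The mapping in the figure can be written out as a short C sketch; the register value and function names here are illustrative, not taken from any real machine.

#include <stdio.h>

/* Illustrative relocation hardware: every logical address the CPU
   generates is mapped by adding the contents of the relocation
   (base) register. The register value matches the figure. */
static unsigned relocation_register = 1400;

unsigned map_address(unsigned logical)
{
    return logical + relocation_register;   /* physical address */
}

int main(void)
{
    unsigned logical = 45;
    printf("logical %u -> physical %u\n", logical, map_address(logical));
    return 0;
}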
Observation
Logical address: a program instruction address.
Physical address: the real address of the program in memory.
Mapping from logical to physical: performed by the base/relocation register.

Swapping: Consider the following situation. Job 1 is very large (50K); Job 2 is small (5K). Job 1 is in memory and under execution, so Job 2 has to wait for Job 1 to finish. What if Job 2 is a very high priority job? Result: Job 2 must still wait.

Improvement: Swap Job 1 out of the memory briefly, schedule Job 2 and let it finish, then schedule Job 1 again.

Implementation: Under the resident monitor, this scheme can be implemented by saving the current state of Job 1 on the disk so that Job 2 can be loaded for execution; Job 1 is reloaded once Job 2 finishes. This is some improvement, but we still cannot run more than one job at a time.

Analysis: The main activity affecting the response time of a job is swapping. Suppose:
Latency time: 8 milliseconds.
Program size: 20K words.
Transfer rate: 250,000 words/second = 250 words/ms.
Transfer time (20K): 20,000 / 250 = 80 ms.
Total transfer time: 8 (latency) + 80 = 88 ms = 0.088 s.

Number of swaps: The minimum number of movements between the memory and the disk is 2 (one out, one in). In its lifetime a job may be swapped more than twice and may accumulate a large swapping time. Some efficiency can be gained by
making the CPU time for a process longer than the swap time, so that the CPU does not remain idle for long periods. Under this scenario the CPU time should be larger than 0.176 seconds (0.088 x 2).

How much to swap: Another improvement is to compute the exact swap amount, i.e., the size of the program actually in memory. It is not necessary to swap the entire user area, since the active program may not be as large as the 20/30K user memory. The swap time can also be reduced by improving the disk hardware.

Overlapped swapping: What about dividing the total running activity of a job between I/O and the CPU? I/O would be done by the I/O processor and execution by the CPU, so the swapping activity would be handled by the I/O processor while the CPU executes. Good, but what will the CPU do while the I/O processor is swapping? There is no job in the memory during this time, so the CPU must be given a job while the I/O processor is swapping.
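The swap-time arithmetic above can be checked with a small program; the numbers are the ones used in the text.

#include <stdio.h>

int main(void)
{
    double latency_ms = 8.0;       /* disk latency                */
    double words      = 20000.0;   /* program size: 20K words     */
    double rate       = 250.0;     /* transfer rate: 250 words/ms */

    double transfer_ms = words / rate;               /* 80 ms    */
    double one_swap_ms = latency_ms + transfer_ms;   /* 88 ms    */
    double min_cpu_s   = 2.0 * one_swap_ms / 1000.0; /* 0.176 s  */

    printf("one swap: %.0f ms; CPU time per process should exceed %.3f s\n",
           one_swap_ms, min_cpu_s);
    return 0;
}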
[Figure: Swapping - one job is swapped out to disk while another is swapped in]
Bottlenecks
Program movement in the memory takes time. While a program is being swapped, the CPU is idle.
Improvements
Allow swapping and program execution to proceed simultaneously. It should not be necessary to move a job from a buffer to the user area for execution; the job should be able to execute from anywhere in the memory.
Solution
Make the fence register mobile: set the fence register to the beginning address of the job (the job entry point) and schedule the job for execution. No buffer-to-user-area movement is needed.
[Figure: Overlapped swap-out and swap-in using buffers]
Further improvements:
1. Increase the number of partitions.
2. Convert the buffers into user areas where programs can be loaded for execution.
3. Avoid program movement inside the memory.
4. Provide extra hardware support for keeping track of these partitions.
Partitioning the memory: Each partition must be protected from being overwritten by programs in neighboring partitions. Two registers are used: the base register stores the beginning address of the process and the limit register stores its end address. Only two registers are needed, since at any time only one program can be executing.

Example: Limit register (LR) = 100, so program addresses must be <= 100. Base register (BR) = 50, so program addresses must be >= 50.
[Figure: Base/limit protection hardware - every CPU-generated address is compared with the base and limit registers; an address outside the range causes an addressing error trap]
When a program is loaded into a partition for execution, every access to the memory is checked. Suppose the address generated by the CPU is LOGICALADDRESS:

if (LOGICALADDRESS >= BR) and (LOGICALADDRESS <= LR) then
    access the location
else
    generate a trap: memory violation.
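A minimal C sketch of this check; BR and LR are illustrative variables standing in for the two hardware registers.

#include <stdio.h>
#include <stdlib.h>

static unsigned BR = 50;    /* base register: partition start */
static unsigned LR = 100;   /* limit register: partition end  */

/* Every CPU-generated address is validated before the access. */
void access_memory(unsigned addr)
{
    if (addr >= BR && addr <= LR)
        printf("access to %u granted\n", addr);
    else {
        fprintf(stderr, "trap: memory violation at %u\n", addr);
        exit(EXIT_FAILURE);   /* terminate the offending program */
    }
}

int main(void)
{
    access_memory(75);    /* inside the partition        */
    access_memory(120);   /* outside: generates the trap */
    return 0;
}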
Analysis of Multiple Partitions
Partition size: variable. Number of partitions: fixed. Arrangement: partitions are contiguous.
Name of the scheme: Multiple Contiguous Fixed Partition Allocation. The commercial (IBM) name: Multiprogramming with a Fixed number of Tasks (MFT).

A typical partition set (total memory size 256K):
Resident monitor: 64K.
16 partitions: 80K (5K each; very small jobs).
4 partitions: 60K (15K each; average jobs).
1 partition: 52K (very large jobs).
These partition sizes may depend on the environment under which the system is going to function.
First Fit Allocation: Select the first partition >= the job size. This implies that it may select a partition much larger than the program, and the next larger program may then be denied a place in the memory.

Best Fit Allocation: Search for the most suitable partition for the job. The best case is partition size = job size. This seldom happens, so the next choice is the smallest available partition that is still large enough to accommodate the job.

NOTE: If the memory partitions are sorted in ascending order of size, then first fit = best fit.

Worst Fit Allocation: Allocate the largest available partition to a job.

NOTE to students: Study the advantages and disadvantages of these three schemes.
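A sketch of first-fit and best-fit selection over a small, made-up partition table (real MFT bookkeeping is richer). Note how first fit grabs the large partition that best fit preserves.

#include <stdio.h>

#define NPART 4

/* Partition sizes in K (deliberately unsorted) and free flags. */
static int part_size[NPART] = {52, 5, 15, 15};
static int part_free[NPART] = {1, 1, 1, 1};

/* First fit: the first free partition at least as big as the job. */
int first_fit(int job)
{
    for (int i = 0; i < NPART; i++)
        if (part_free[i] && part_size[i] >= job)
            return i;
    return -1;   /* no partition found */
}

/* Best fit: the smallest free partition still big enough. */
int best_fit(int job)
{
    int best = -1;
    for (int i = 0; i < NPART; i++)
        if (part_free[i] && part_size[i] >= job &&
            (best == -1 || part_size[i] < part_size[best]))
            best = i;
    return best;
}

int main(void)
{
    /* First fit picks the 52K partition; best fit picks a 15K one. */
    printf("12K job: first fit -> partition %d\n", first_fit(12));
    printf("12K job: best fit  -> partition %d\n", best_fit(12));
    return 0;
}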
[Figure: MFT with a separate queue for each partition]

One queue for all the partitions: a single queue serves all the partitions.
[Figure: MFT with one queue serving all partitions]

Problems with MFT
1. During execution a process may demand more memory. If the partition in which the job is loaded has some spare memory, the demand can be met; otherwise the job is terminated with the message "Insufficient Memory".
2. A process is swapped out, and when a larger partition becomes available the job can be swapped into the larger area. This is not easy to do; it needs hardware support.
Memory fragmentation: a portion of a partition that cannot be allocated to another process.
Internal fragmentation: fragmentation inside a partition.
External fragmentation: fragmentation outside the partitions.

Criteria for the best partition set
1. Minimum fragmentation (external or internal).
2. Most processes can execute.
With dynamically allocated partitions:
Storage allocation: best fit or first fit.
External fragmentation: can be frequent. Reason: jobs release their memory dynamically, and a released piece may not be sufficient for the next job on the queue, so many small unallocated pieces (holes) may be scattered all over the memory.
Internal fragmentation: none, since there are no predefined partitions.
[Figure: Compaction alternatives - (a) initial allocation with jobs J1-J4 separated by holes; (b) all jobs moved toward one end: 600K moved; (c) J4 moved up next to J3: 400K moved; (d) J3 moved down below J4: 200K moved]
Solution: Compaction. Merge these pieces and relocate the existing jobs in the memory.

Algorithm
1. Check the memory status table to find the locations (beginning and end addresses) of the free memory pieces.
2. Join all adjacent pieces together and keep track of the largest run of pieces glued together.
3. Join other pieces to this large piece.
4. Relocate all the affected programs.

Simple method: move all jobs towards one end of the memory and all free partitions to the other end (figure b). Total number of words moved: 600K.
Second method: move job 4 above job 3 (figure c). Total number of words moved: 400K.
Third method: move job 3 down below job 4 (figure d). Total number of words moved: 200K.

Machines that used memory compaction: the SCOPE O/S of the CDC 6600, the PDP-10, and the Univac 1108. They used base and limit registers to implement memory compaction.

Compaction problems
1. Relocating programs is expensive. In a heavily loaded system, program relocation may take up too much CPU time.
2. Frequent relocation may create addressing errors.

Limitations of these memory management mechanisms
1. Every program must be loaded into contiguous memory locations.
2. Every program must have its own copy of any system routine or procedure it uses. This means that several copies of the same code may reside in the memory, i.e., there is no program sharing. This limitation may be eliminated by swapping, but that would be expensive because swapping takes time and resources.
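A minimal sketch of the "move everything toward one end" method (figure b); the job table here is invented for illustration.

#include <stdio.h>

struct job { const char *name; unsigned base, size; };

/* Jobs sorted by base address; the gaps between them are holes. */
static struct job jobs[] = {
    {"J1", 100, 300}, {"J2", 600, 200}, {"J3", 1100, 400},
};
#define NJOBS (sizeof jobs / sizeof jobs[0])

/* Slide every job toward the low end of memory, counting how
   many words are moved (the relocation cost discussed above). */
unsigned compact(unsigned start)
{
    unsigned next = start, moved = 0;
    for (unsigned i = 0; i < NJOBS; i++) {
        if (jobs[i].base != next) {      /* job must be relocated */
            jobs[i].base = next;
            moved += jobs[i].size;
        }
        next += jobs[i].size;
    }
    return moved;
}

int main(void)
{
    printf("words moved: %u\n", compact(100));
    for (unsigned i = 0; i < NJOBS; i++)
        printf("%s now at %u\n", jobs[i].name, jobs[i].base);
    return 0;
}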
Improvements
Paging: The mechanism is based on the idea that at any time only one instruction of a program is required by the CPU. Furthermore, physical contiguity of program instructions is not necessary; only logical contiguity is required. For example, a JUMP or GOTO instruction breaks the physical contiguity of the instructions, yet the program still executes correctly. Since a program cares only about logical contiguity, the memory manager is free to break physical contiguity. Logical contiguity is usually provided by pointers, but linking the instructions of a program via pointers is impractical. The paging mechanism instead establishes logical contiguity by restructuring the physical address space and the logical address space.

Physical address space: the entire physical memory.
Logical address space: all program-referenced addresses.
Instruction access requires: the starting address of the page holding the instruction, and the displacement (offset) from the start of this page. The words of a 2^n-word page can be addressed by n bits. Similarly, if there are 2^m blocks (frames) in the memory, then m bits are needed to number all these blocks. A page table of a program holds, for each page number, the physical starting address of that page. The page number is used to index the page table, which gives the physical address of that page. This is nothing
but the base address of that page in the memory. The page offset is added to this address to reach the desired instruction.

Example: Consider page size = 4 words, physical memory size = 32 words = 8 frames.
[Figure: Paging example - program pages: page 0 = a b c d, page 1 = e f g h, page 2 = i j k l, page 3 = m n o p. Page table: page 0 -> frame 5, page 1 -> frame 6, page 2 -> frame 1, page 3 -> frame 2. Memory: frame 1 (word 4) holds i j k l, frame 2 (word 8) holds m n o p, frame 5 (word 20) holds a b c d, frame 6 (word 24) holds e f g h]
Paging hardware: p = page number, d = offset in the page, f = frame number where the page is loaded.
[Figure: Paging hardware - the CPU generates a logical address (p, d); p indexes the page table to obtain frame number f, and the physical address (f, d) is presented to main memory]
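A sketch of this translation in C, using the page table of the 4-word-page example above.

#include <stdio.h>

#define PAGE_SIZE 4   /* words per page, as in the example */

/* Page table from the example: page p lives in frame page_table[p]. */
static unsigned page_table[4] = {5, 6, 1, 2};

unsigned translate(unsigned logical)
{
    unsigned p = logical / PAGE_SIZE;   /* page number      */
    unsigned d = logical % PAGE_SIZE;   /* offset in page   */
    unsigned f = page_table[p];         /* frame number     */
    return f * PAGE_SIZE + d;           /* physical address */
}

int main(void)
{
    /* word 'a' (logical 0) is in frame 5 -> physical 20 */
    printf("logical 0  -> physical %u\n", translate(0));
    /* word 'n' (logical 13: page 3, offset 1) -> frame 2 -> physical 9 */
    printf("logical 13 -> physical %u\n", translate(13));
    return 0;
}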
Implementation of the page table: One register per page table entry is not economical for large page tables. Since the page tables of ready processes always reside in main memory, a Page Table Base Register (PTBR) points to the base address of the running process's page table. In this way each program can maintain its separate page table, and switching page tables is done by changing the contents of the PTBR.

Associative registers: a set of very fast, content-addressable registers (a look-aside memory) that can be accessed in parallel.
Improvement: A program requires only a part of its entire page table to get the address of the desired block, so put only a few entries of a set of page tables in the associative registers.

Instruction access
1. Search the set of associative registers.
2. If the page number is found, proceed to get the memory address; otherwise fetch the frame number from the process's page table in memory.

Memory access time under paging
Hit ratio: the percentage of time a page number is found in the associative registers. The hit ratio increases with the number of associative registers; with 8 or 16 of them, a hit ratio of 80 to 90% can be obtained. An 80% hit ratio means that 80% of the time we find the desired page number in the associative registers.

Some useful data:
Time to search the associative registers: 50 ns.
Time to access the memory: 750 ns.
Total access time when the page number is in the associative registers: 50 + 750 = 800 ns.

When the page number is not in the associative registers:
Time to access the page table in memory = 750 ns.
Time to access the desired word = 750 ns.
Total time = 50 + 750 + 750 = 1550 ns.

Effective access time = 0.80 x 800 + 0.20 x 1550 = 950 ns.
Access slows down from 750 ns to 950 ns, i.e., by 26.6%.
For a 90% hit ratio: 0.90 x 800 + 0.10 x 1550 = 875 ns.
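The effective-access-time arithmetic written out as a small program (times from the text):

#include <stdio.h>

int main(void)
{
    double assoc = 50.0, mem = 750.0;        /* times in ns */
    double hit_time  = assoc + mem;          /* 800 ns      */
    double miss_time = assoc + mem + mem;    /* 1550 ns     */

    double ratios[] = {0.80, 0.90};
    for (int i = 0; i < 2; i++) {
        double eat = ratios[i] * hit_time + (1.0 - ratios[i]) * miss_time;
        printf("hit ratio %.0f%%: effective access time %.0f ns\n",
               ratios[i] * 100.0, eat);   /* prints 950 and 875 */
    }
    return 0;
}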
Page Sharing
Program and data sharing is possible under paging. Sharing avoids giving each program a private copy of a page. This matters when re-entrant (pure procedure) code is used by several programs. Re-entrant code is code that is not modified during its execution, i.e., it contains no store into itself (no self-modification). Editors, compilers, loaders, etc., are examples of such code.

No-sharing example: Number of users: 40. Text editor size: 30K. Data space: 5K per user. Each user must have a private copy of the 30K editor, since users cannot share each other's copies. Total memory required to support these users: 40 x 35K = 1400K.

Sharing example: with a single shared copy of the editor, only each user's 5K data space is private: 30K + 40 x 5K = 230K.
[Figure: Page sharing - three processes' page tables map the shared editor pages to the same frames (3, 4, 6); each process has its own data page]
Protection: To guard against illegal accesses by a program, consider the following situation:
Address space: 14 bits (addresses 0 to 16,383).
Page size: 2K (2048 words).
Program size: 0 to 10,468 (pages 0 to 5).
Total number of addressable pages: 16,384 / 2048 = 8.
This means pages 0 through 5 are legal and pages 6 and 7 are illegal for this program, so if the program tries to access page 6 or 7 the system must trap it. To recognize an illegal page request, valid-invalid bits are provided in the page table. The page table for the above system then looks like:
[Figure: Page table with valid-invalid bits - entries for pages 0-5 are valid (the program occupies addresses 0 to 10,468); entries 6 and 7 are marked invalid]
Segmentation
User's view of memory: the memory where a user's program is loaded is modularized just like the program itself, i.e., segmented. The memory management technique that implements this view is called segmentation.
[Figure: A user's view of a program as segments - e.g., stack, SQRT, symbol table]
Implementation
A segment is referred to by a segment number. During compilation the compiler constructs the segments.
A Pascal compiler might create separate segments for the global variables, the local variables of each procedure, procedure calls, a stack to store parameters and link addresses, etc. The loader assigns segment numbers to these segments. To map an address generated by the CPU (a logical address), a segment table is provided. A logical address consists of two parts: a segment number (s) and an offset within the segment (d). The segment number is used as an index into the segment table. Each entry of the segment table has a segment base and a segment limit. The hardware organization of this scheme is as follows:
[Figure: Segmentation hardware - the offset d is checked against the segment's limit; if d < limit, the physical address is base + d, otherwise an addressing error trap is raised]
Example: We illustrate the segmentation mechanism with the following example:
[Figure: The five segments (0-4) of the logical address space and their placement in physical memory.
Segment table: segment 0 - limit 1000, base 1400; segment 1 - limit 400, base 6300; segment 2 - limit 400, base 4300; segment 3 - limit 1100, base 3200; segment 4 - limit 1000, base 4700]
There are five segments, numbered 0 through 4, stored in the physical memory as shown. The segment table has a separate entry for each segment, giving the beginning address of the segment in memory (the base) and the length of the segment (the limit). For example, segment 2 is 400 words long, beginning at location 4300; a reference to word 53 of segment 2 is thus mapped onto location 4300 + 53 = 4353. A reference to segment 3, word 852, is mapped to 3200 + 852 = 4052. A reference to word 1222 of segment 0 would result in a trap to the O/S, since segment 0 is only 1000 words long.
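A sketch of the segment-table lookup, reproducing the example's mappings and the trap:

#include <stdio.h>
#include <stdlib.h>

struct seg { unsigned limit, base; };

/* The segment table from the example above. */
static struct seg seg_table[5] = {
    {1000, 1400}, {400, 6300}, {400, 4300}, {1100, 3200}, {1000, 4700}
};

unsigned translate(unsigned s, unsigned d)
{
    if (d >= seg_table[s].limit) {            /* offset out of range */
        fprintf(stderr, "trap: addressing error (segment %u, word %u)\n",
                s, d);
        exit(EXIT_FAILURE);
    }
    return seg_table[s].base + d;
}

int main(void)
{
    printf("segment 2, word 53  -> %u\n", translate(2, 53));    /* 4353 */
    printf("segment 3, word 852 -> %u\n", translate(3, 852));   /* 4052 */
    translate(0, 1222);   /* traps: segment 0 is only 1000 words long */
    return 0;
}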
Virtual Memory
So far we have managed to eliminate all restrictions on program execution except one: the complete program must be loaded into the memory before execution can begin. A careful analysis of program behavior indicates that a program does not require its entire code to begin execution. Example:

read (a,b,c);
if a > b + c then go to look
else begin
    ...
end;
...
look: begin
    writeln ('Error in reading data. Check data');
    ...
end;

The error-handling code at label look need not be in memory unless the test actually fails.
A program during execution needs only one instruction at a time. This behavior implies that an instruction needs to be loaded into the memory only when it is required by the CPU.

Example: If a program declares an array of 100 x 100 but uses only 30 x 30, the whole 100 x 100 array would have to be in memory, occupying 100 x 100 x 2 = 20,000 = 20K (approx.); about 18K of memory would go to waste. So, for efficient memory utilization, use a demand strategy.

Demand strategy: Load only the entry point of the program; the other parts are loaded when the CPU asks for them.

Further investigation: The semantics and syntax of a program are not related at all to the way memory is allocated to it. A program written in any language is treated in exactly the same way, i.e., the binary of a program is completely independent of the high-level language. Of course, the syntax of some languages helps the memory management technique optimize memory utilization: languages like ALGOL W or ALGOL 68 have dynamic arrays whose space is allocated at run time. But this should not be the criterion around which the OS is built.

The address space may be larger than the physical space. Successful execution of programs in this environment requires that
a. the correct program entry point is available to the CPU;
b. the CPU is able to state its requirement;
c. the requirement is interpreted correctly and errors are detected;
d. the requirement is made available to the CPU as soon as possible.
Implementation of the demand strategy: The requested parts are loaded (overlaid) onto memory occupied by parts of the program that are no longer needed. This overlay activity is managed by an overlay driver. The following diagram illustrates the structure. Consider a two-pass assembler:
Pass 1: construct a symbol table. Code size: 8K.
Pass 2: generate machine code. Code size: 10K.
Symbol table size: 14K. Common routines: 5K. Total memory requirement: 37K. Memory available: 32K.

This can be managed as follows (with the overlay driver in memory):
1. Load pass 1, the symbol table and the common routines from disk. This works, since pass 1 does not need pass 2.
2. Load pass 2, the symbol table and the common routines from disk. This works, since pass 2 does not need pass 1.
This arrangement requires some special registers to handle the overlay.
[Figure: Overlay structure - the symbol table (14K), common routines (5K) and overlay driver (2K) stay resident while pass 1 (8K) and pass 2 (10K) are overlaid in the same region]
Problems: Since overlaying involves loading the correct parts of the program, the program must be structured in a special way. The programmer must have complete knowledge of the program's execution and its data structures. This may become very difficult for large programs, and if an overlay error occurs the program may have to be restructured.

A method similar to overlay is dynamic loading. The idea is similar: program modules are loaded on demand. A module may be a procedure, a function, a subroutine, a library routine, etc.

Demand paging: Load program pages when they are demanded by the CPU. System architecture:
[Figure: Demand paging - logical address space with pages 0-5 (A-F); only some pages are resident in memory frames, the rest remain on disk]
Terms
Page fault (non-equivalence): an event. It occurs when the desired page is not in main memory. A missing page is indicated with an "i" (invalid) in the page table.

Page replacement: an action. It decides which page should be moved from a completely full main memory to disk to make room for the demanded page, and transfers the demanded page from disk.

Mechanism: The mechanism of demand paging is similar to ordinary paging. A logical address is generated and mapped onto a physical address by the paging hardware. The page may or may not be in physical memory; in the latter case the page is located on the disk and moved into memory. Demand paging differs from simple paging only on this point. The following diagram explains the entire sequence of processing a page fault.
[Figure: Servicing a page fault - the instruction "Load M" references a page whose table entry is marked i (invalid); the OS locates the page on the disk, brings it into a free frame, updates the page table, and restarts the instruction]
Steps
1. The CPU generates the logical address for LOAD M.
2. A reference to the page table indicates that the page is not in memory and gives its disk address. A trap is generated and the system enters privileged mode.
3. The page is located on the disk.
4. Room is made available in memory and the page is moved in.
5. The page table is updated to indicate that the page is in memory.
6. The instruction is restarted (system back in user mode).
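A toy simulation of these steps; the free-frame pool and the absence of a real disk are simplifications for illustration.

#include <stdio.h>

#define NPAGES 8

/* page_table[p] is the frame holding page p, or -1 (invalid bit). */
static int page_table[NPAGES];
static int next_free_frame = 0;   /* naive free-frame pool */

/* Steps 3-6: locate the page "on disk", make room, update the
   table, and let the faulting access restart. */
static int service_page_fault(int page)
{
    int frame = next_free_frame++;
    printf("  page fault: loading page %d into frame %d\n", page, frame);
    page_table[page] = frame;
    return frame;
}

/* Steps 1-2: reference the page table; an invalid entry traps. */
int access_page(int page)
{
    if (page_table[page] < 0)
        return service_page_fault(page);
    return page_table[page];
}

int main(void)
{
    for (int p = 0; p < NPAGES; p++)
        page_table[p] = -1;       /* nothing resident initially */

    access_page(3);   /* faults, page is brought in */
    access_page(3);   /* now resident: no fault     */
    return 0;
}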
Implementation problems
1. If the page fault occurs on the instruction fetch, restart by fetching the instruction again.
2. If a page fault occurs while fetching an operand, re-fetch the instruction, decode it again, and then fetch the operand. Consider the instruction ADD A B:
a. Fetch and decode ADD.
b. Fetch A.
c. Fetch B.
d. Add A and B.
e. Store the sum in C.
Suppose ADD and operand A are on page 1 and operand B is on page 5. Page 1 is in memory and page 5 is on disk. A page fault occurs when the CPU tries to access B: execution of the instruction is incomplete and is suspended, room is made in memory, and page 5 is moved in.
The page table is updated and ADD A B starts from the beginning.

3. A problem occurs when one instruction may modify several different locations. Example: the MOVE CHARACTER instruction. The IBM 370 MVC instruction can move up to 256 characters from one location to another. A page fault may occur just before the last move is completed. The partial result is saved in memory and the operation resumes from that point when the desired page has been brought into memory.

Similar issues arise with auto-increment and auto-decrement addressing on the PDP machines (VAX also). In auto-increment addressing, the address is incremented by a word (2 bytes) after the operand has been fetched. In auto-decrement addressing, the address is first decremented and the resulting address is then used to fetch the operand. Example: MOV (R2)+, -(R3). If a page fault occurs when accessing the operand pointed to by R3, execution is suspended, the desired page is brought into memory, the page table is updated, and execution resumes from the beginning. The original contents of R2 and R3 are saved using a special register (SR1), so after the page fault R2 and R3 receive their original contents and execution restarts.
FIFO: the simplest replacement algorithm. It removes the oldest page from memory to make room for the incoming page. Example, with three memory frames and the page reference string: 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1.
Reference:  7  0  1  2  3  0  4  2  3  0  1  2  7  0  1
Frame 1:    7  7  7  2  2  2  4  4  4  0  0  0  7  7  7
Frame 2:       0  0  0  3  3  3  2  2  2  1  1  1  0  0
Frame 3:          1  1  1  0  0  0  3  3  3  2  2  2  1
(columns are shown only at page faults: 15 faults in all)
FIFO page replacement

Problem: FIFO suffers from what is called Belady's anomaly: at some point the page fault rate may increase as the number of memory frames increases.

Example: Page reference string A, B, C, D, A, B, E, A, B, C, D, E. Suppose memory can hold only one page:
CPU requires page:  A  B  C  D  A  B  E  A  B  C  D  E
Page fault:         Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y  Y
Page in memory:     A  B  C  D  A  B  E  A  B  C  D  E
Page moved out:     -  A  B  C  D  A  B  E  A  B  C  D

Total number of page faults = 12.
Intuitively, more memory frames should mean fewer page faults. Let us check this. Same reference string; memory can hold three pages:

CPU requires page:  A  B  C  D  A  B  E  A  B  C  D  E
Page fault:         Y  Y  Y  Y  Y  Y  Y  N  N  Y  Y  N
Page moved out:     -  -  -  A  B  C  D  -  -  A  B  -

Number of page faults = 9.

Now let memory hold four pages:

CPU requires page:  A  B  C  D  A  B  E  A  B  C  D  E
Page fault:         Y  Y  Y  Y  N  N  Y  Y  Y  Y  Y  Y
Page moved out:     -  -  -  -  -  -  A  B  C  D  E  A

Number of page faults = 10: more frames, yet more faults. This is Belady's anomaly.
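A small FIFO simulator (a sketch, not any particular system's implementation) reproduces both counts:

#include <stdio.h>
#include <string.h>

/* Count FIFO page faults for a reference string with n frames. */
int fifo_faults(const char *refs, int n)
{
    char frames[16];
    int used = 0, oldest = 0, faults = 0;

    for (; *refs; refs++) {
        if (memchr(frames, *refs, used))
            continue;                     /* page already resident */
        if (used < n)
            frames[used++] = *refs;       /* a frame is still free */
        else {
            frames[oldest] = *refs;       /* evict the oldest page */
            oldest = (oldest + 1) % n;
        }
        faults++;
    }
    return faults;
}

int main(void)
{
    const char *s = "ABCDABEABCDE";
    printf("3 frames: %d faults\n", fifo_faults(s, 3));   /* 9  */
    printf("4 frames: %d faults\n", fifo_faults(s, 4));   /* 10 */
    return 0;
}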
Optimal Replacement: On a page fault, replace the page that will not be required for the longest period of time. An optimal page replacement algorithm generates the lowest number of page faults and never suffers from Belady's anomaly. Example, with the reference string 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 and three frames:

Reference:  7  0  1  2  3  4  0  1  7
Frame 1:    7  7  7  2  2  2  2  2  7
Frame 2:       0  0  0  0  4  0  0  0
Frame 3:          1  1  3  3  3  1  1
(columns are shown only at page faults: 9 faults in all)
Optimal page replacement
Implementation of this algorithm is not possible, since it requires future knowledge of the reference string. However, the algorithm can serve as a standard for measuring the efficiency of other algorithms.

Least Recently Used (LRU): the page that has not been used for the longest period of time is removed from memory. This is the optimal algorithm looking backward in time rather than forward. Example, with the same reference string and three frames:
Reference:  7  0  1  2  3  4  2  3  0  1  0  7
Frame 1:    7  7  7  2  2  4  4  4  0  1  1  1
Frame 2:       0  0  0  0  0  0  3  3  3  0  0
Frame 3:          1  1  3  3  2  2  2  2  2  7
(columns are shown only at page faults: 12 faults in all)
LRU page replacement
Implementation of LRU
Method 1: Use a clock, incremented whenever a page is referenced. Each page table entry has its own time-of-use field, and when a page is to be replaced the page tables are searched for the page whose reference time is lowest.
Method 2: Use a doubly linked list. The most recently referenced page is moved to the top of the list, and the page at the bottom of the list is removed when a page fault occurs.

Implementation of LRU using a matrix: The hardware maintains a matrix of n x n bits, where n is the number of memory page frames. Initially all bits are zero, indicating that no reference has been made. Whenever a page is referenced, the bits of the corresponding row are set to 1s and then the bits of the corresponding column are set to 0s. At any instant, the frame whose row value is lowest holds the least recently used page and is the candidate for replacement. The working of the algorithm for a 4-frame memory is given in the following figure; "page 2(1)" means the incoming page is 2 and it replaces the page stored in memory frame 1. Reference string: 0 1 2 3 4 5 0 3 2 3 3 1 4 5 0 1.
[Figure: LRU matrix snapshots for the 16-reference string above - after each reference the referenced frame's row is set to 1s and its column to 0s; the frame whose row holds the smallest value is the replacement candidate]
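A sketch of the matrix method for n = 4 frames (bit rows are kept as plain byte arrays for clarity):

#include <stdio.h>

#define N 4   /* number of memory frames */

static unsigned char m[N][N];   /* the n x n reference matrix */

/* On a reference to the page in frame f: set row f to all 1s,
   then clear column f. */
void reference(int f)
{
    for (int j = 0; j < N; j++) m[f][j] = 1;
    for (int i = 0; i < N; i++) m[i][f] = 0;
}

/* The frame whose row, read as a binary number, is smallest
   holds the least recently used page. */
int lru_frame(void)
{
    int best = 0, best_val = -1;
    for (int i = 0; i < N; i++) {
        int val = 0;
        for (int j = 0; j < N; j++)
            val = (val << 1) | m[i][j];
        if (best_val < 0 || val < best_val) {
            best_val = val;
            best = i;
        }
    }
    return best;
}

int main(void)
{
    for (int f = 0; f < N; f++) reference(f);   /* touch frames 0..3 */
    reference(1);                               /* re-touch frame 1  */
    printf("LRU frame: %d\n", lru_frame());     /* prints 0          */
    return 0;
}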
Least Frequently Used (LFU): Replace the page that is least frequently (least intensively) referenced. Under this policy, a page that was brought in most recently may be selected for removal from the memory.
Most Frequently Used (MFU): Pages which have been used most frequently are removed.

Not Used Recently (NUR): Pages not used recently are not likely to be used in the near future, and they may be replaced.

Minimum number of frames: What is the minimum number of frames that must be available for an instruction to be executed? Example: Suppose a system provides one level of indirect addressing: the first operand contains the address of the location which holds the operand value. To execute the instruction SUB A (B), a minimum of three pages is required, and the execution can be illustrated as follows:
[Figure: Pages required while executing SUB A (B) - the page holding the instruction, the page holding the indirect address, and the page holding the operand value]
The PDP-8 requires 3 pages, the PDP-11 at least 6, and the IBM 370 requires 8. The Nova 3 (Data General) allows multiple levels of indirection, so in the (theoretical) worst case an instruction can reference the entire virtual address space. Thus the minimum number of frames per process is defined by the architecture of the instruction set, while the maximum number is defined by the size of available memory.

Locality: The observation that processes tend to reference storage in nonuniform, highly localized patterns. It is an empirical (observed) rather than a theoretical property. Two kinds:

Temporal locality: locality over time. Example: If the weather is sunny at 3 p.m., then there is a good chance (but certainly no guarantee) that the weather was sunny at 2:40 p.m. and will be sunny at 3:30 p.m.

Spatial locality: nearby items tend to be similar. Example: If it is sunny in one town, it is likely (but not guaranteed) to be sunny in the neighboring towns.
Locality in OS environment
Processes tend to favor certain subsets of their pages (temporal locality), and these pages often tend to be adjacent to one another (spatial locality). This does not mean that a process won't reference a new page.

Examples of temporal locality (storage locations referenced recently are likely to be referenced in the near future):
a. loops
b. subroutines
c. stacks, and variables used for counting and totaling.

Examples of spatial locality (once a location is referenced, it is highly likely that nearby locations will be referenced):
a. array traversals
b. sequential code execution
c. the tendency of programmers to place related variable definitions near one another.

Process execution is efficient when the process's favored subset of pages is in memory. A program may finish with one locality and move to another, for example when migrating from one subroutine or procedure to another. This implies that a program is made up of several localities. The idea of locality gave rise to a storage allocation policy called the Working Set Model.
[Figure: Working set - page reference string
2 6 1 5 7 7 7 7 5 1 6 2 3 4 1 2 3 4 4 4 3 4 3 4 4 4 1 3 2 3 4
with a window of 10 references: the working set at t1 (after the first ten references) is {1, 2, 5, 6, 7}; at t2 (near the end) it is {3, 4}]
Let P1, P2 and P3 be three processes in memory, with working set sizes WSS1, WSS2 and WSS3. For efficient execution of P1, P2 and P3, the total number of pages that must be in the memory is

    WSS1 + WSS2 + WSS3 = sum of WSSi for i = 1 to 3.
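A sketch that computes working set sizes over a window of 10 references, using the reference string from the figure; the window value is illustrative.

#include <stdio.h>

/* Working set size at time t: the number of distinct pages among
   the last `window` references (the working set window). */
int wss(const int *refs, int t, int window)
{
    int seen[32] = {0}, size = 0;
    int start = (t - window + 1 < 0) ? 0 : t - window + 1;
    for (int i = start; i <= t; i++)
        if (!seen[refs[i]]) { seen[refs[i]] = 1; size++; }
    return size;
}

int main(void)
{
    /* the reference string from the working set figure */
    int refs[] = {2,6,1,5,7,7,7,7,5,1,6,2,3,4,1,2,3,4,4,4,
                  3,4,3,4,4,4,1,3,2,3,4};
    int n = sizeof refs / sizeof refs[0];

    printf("WSS after reference 10: %d\n", wss(refs, 9, 10));     /* 5 */
    printf("WSS after reference %d: %d\n", n, wss(refs, n - 1, 10));
    return 0;
}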
Thrashing: progress in process execution is less than the paging activity. Suppose processes P1 and P2 do not have their complete working sets in memory. The result:
1. A very high page fault rate and consequently very heavy paging traffic.
2. The CPU is busy, most of the time, moving pages around.
3. No progress in process execution.
4. Program response time and throughput go down.
When the throughput declines, the scheduler thinks there are not many processes in memory and schedules new ones. This makes the situation worse: the CPU gets busier and busier with paging traffic and the throughput declines further. This phenomenon is called thrashing.
Problem with the working set: supporting the dynamic nature of the working set window. Its behavior cannot be predicted exactly; on the basis of program behavior one can give only an approximate idea of the window size.

Global versus local allocation: The page replacement algorithms can be applied in two different ways:
1. On a page fault, select a victim page from among the pages of all active processes. This is global page allocation.
2. On a page fault, replace a page belonging to the process that generated the fault. This is local page allocation.
The problem with global allocation is that the execution of one process may affect the execution of other processes, since each process has a variable number of pages in memory. The advantage is that a process may not have to wait for memory frames as long as it would under local allocation.

Page size: explained in class.