UNIT 3 Notes-OS
Memory management is the functionality of an operating system which handles or manages primary
memory and moves processes back and forth between main memory and disk during execution.
Memory management keeps track of each and every memory location, regardless of whether it is
allocated to some process or free. It decides how much memory is to be allocated to each process.
It decides which process will get memory at what time.
It tracks whenever some memory gets freed or unallocated and correspondingly it updates the status.
1.1 Basic Hardware
Program must be brought (from disk) into memory and placed within a process for it to be run
Main memory and registers are the only storage the CPU can access directly
Memory unit only sees a stream of addresses + read requests, or address + data and write requests
Register access in one CPU clock (or less)
Main memory can take many cycles, causing a stall
Cache sits between main memory and CPU registers
Protection of memory required to ensure correct operation
We can provide this protection by using two registers, usually a base and a limit
The base register holds the smallest legal physical memory
address; the limit register specifies the size of the range.
For example, if the base register holds 300040 and the limit register is 120900, then the
program can legally access all addresses from 300040 through 420939 (inclusive).
Protection of memory space is accomplished by having the CPU hardware compare every address
generated in user mode with the registers.
Any attempt by a program executing in user mode to access operating-system memory or other
memory results in a trap to the operating system, which treats the attempt as a fatal error.
This scheme prevents a user program from (accidentally or deliberately) modifying the code or data
structures of either the operating system or other users.
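A minimal C sketch of this check, using the base and limit values from the example above (a real CPU performs this comparison in hardware on every user-mode access):

#include <stdio.h>

/* Hypothetical contents of the base and limit registers (from the example). */
static const unsigned BASE  = 300040u;
static const unsigned LIMIT = 120900u;

/* Returns 1 if a user-mode access to addr is legal, 0 if it traps to the OS. */
int is_legal(unsigned addr) {
    return addr >= BASE && addr < BASE + LIMIT;
}

int main(void) {
    unsigned probes[] = {300040u, 420939u, 420940u, 299999u};
    for (int i = 0; i < 4; i++)
        printf("%u -> %s\n", probes[i], is_legal(probes[i]) ? "OK" : "trap to OS");
    return 0;
}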
1.2 Address Binding
Address binding of instructions and data to memory addresses can happen at three different stages:
1. Compile time. If the memory location is known a priori, absolute code can be generated;
the code must be recompiled if the starting location changes (Static).
2. Load time. The compiler translates symbolic addresses to relative (relocatable) addresses,
and the loader translates these to absolute addresses. If it is not known at compile time where the
process will reside in memory, then the compiler must generate relocatable code (Static).
3. Execution time. If the process can be moved during its execution from one memory segment
to another, then binding must be delayed until run time. The absolute addresses are generated by
hardware. Most general-purpose operating systems use this method (Dynamic).
1.3 Logical vs. Physical Address Space
Logical and physical addresses are the same in compile-time and load-time address-
binding schemes.
Logical (virtual) and physical addresses differ in the execution-time address-binding scheme.
The user program deals only with logical addresses; it may create a pointer
to location 346, store it in memory, manipulate it, and compare it with other addresses - all as
the number 346.
1.4 Dynamic Loading
Dynamic loading is a mechanism by which a computer program can, at run time, load
a library (or other binary) into memory, retrieve the addresses of functions and variables
contained in the library, execute those functions or access those variables, and unload the library
from memory.
In short, dynamic loading means loading the library (or any other binary) into memory
during run time.
Dynamic loading can be thought of as similar to plugins: an exe can begin
executing before the dynamic loading happens (the dynamic loading can be
performed using, for example, the LoadLibrary call in C or C++ on Windows)
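As an illustration, a minimal POSIX sketch of dynamic loading in C using dlopen/dlsym (the library name ./libdemo.so and the function hello are hypothetical; on Windows, LoadLibrary/GetProcAddress play the same role). Compile with -ldl on Linux.

#include <stdio.h>
#include <dlfcn.h>

int main(void) {
    /* Load the library into memory at run time (hypothetical name). */
    void *handle = dlopen("./libdemo.so", RTLD_LAZY);
    if (!handle) { fprintf(stderr, "%s\n", dlerror()); return 1; }

    /* Retrieve the address of a function contained in the library. */
    void (*hello)(void) = (void (*)(void)) dlsym(handle, "hello");
    if (hello) hello();   /* execute the loaded function */

    dlclose(handle);      /* unload the library from memory */
    return 0;
}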
Dynamic linking refers to the linking that is done during load or run-time and not when the
exe is created.
In the case of dynamic linking, the linker does minimal work while creating the exe. For the
dynamic linker to work it actually has to load the libraries too. Hence it's also called a linking
loader.
A small piece of code, the stub, is used to indicate how to locate the appropriate library routine.
The stub replaces itself with the address of the routine, and executes the routine.
The operating system is needed to check whether the routine is in the process's memory address space.
Dynamic linking is particularly useful for libraries.
Shared libraries: Programs linked before the new library was installed will continue using
the older library.
2. SWAPPING
2.1 Basic
A process can be swapped temporarily out of memory to a backing store (SWAP OUT)and
then brought back into memory for continued execution (SWAP IN).
Backing store: a fast disk large enough to accommodate copies of all memory
images for all users; it must provide direct access to these memory images.
Roll out, roll in: a swapping variant used for priority-based scheduling algorithms; a lower-
priority process is swapped out so a higher-priority process can be loaded and executed.
Transfer time: Major part of swap time is transfer time. Total transfer time is
directly proportional to the amount of memory swapped.
Example: Let us assume the user process is of size 1 MB and the backing store is a standard hard
disk with a transfer rate of 5 MB per second.
Transfer time = 1000 KB / 5000 KB per second
= 1/5 sec = 200 ms
A process with dynamic memory requirements will need to issue system calls (request_memory()
and release_memory()) to inform the operating system of its changing memory needs.
2.2 Swapping on Mobile Systems
Swapping is typically not supported on mobile platforms, for several reasons:
Mobile devices typically use flash memory in place of more spacious hard drives
for persistent storage, so there is not as much space available.
Flash memory can only be written to a limited number of times before it
becomes unreliable.
The bandwidth to flash memory is also lower.
Apple's iOS asks applications to voluntarily free up memory
Read-only data, e.g. code, is simply removed, and reloaded later if needed.
Modified data, e.g. the stack, is never removed.
Apps that fail to free up sufficient memory can be terminated by the OS.
Android follows a similar strategy.
Prior to terminating a process, Android writes its application state to flash memory for
quick restarting.
3. CONTIGUOUS MEMORY ALLOCATION
One approach to memory management is to load each process into a contiguous space.
The operating system is allocated space first, usually at either low or high memory locations, and then
the remaining available memory is allocated to processes as needed.
3.1 Memory Protection
Using a relocation register together with a limit register protects user programs from accessing
memory that they should not, allows programs to be relocated to different memory starting
addresses as needed, and allows the memory space devoted to the OS to grow or shrink
dynamically as needs change.
3.2 Memory Allocation
o As processes complete and leave, they create holes in the main memory.
o Hole: a block of available memory; holes of various sizes are scattered throughout
memory.
Dynamic Storage-Allocation Problem: how to satisfy a request of size n from a list of free holes.
Solution:
First fit: Allocate the first hole that is big enough.
Best fit: Allocate the smallest hole that is big enough; the entire list must be searched,
unless it is ordered by size. This produces the smallest leftover hole.
Worst fit: Allocate the largest hole; the entire list must also be searched. This produces
the largest leftover hole.
Example :
Given five memory partitions of 100 KB, 500 KB, 200 KB, 300 KB, and 600 KB (in order), how would
each of the first-fit, best-fit, and worst-fit algorithms place processes of 212 KB, 417 KB, 112 KB, and
426 KB (in order)?Which algorithm makes the most efficient use of memory?
a. First-fit:
1. 212K is put in 500K partition
2. 417K is put in 600K partition
3. 112K is put in 288K partition (new partition 288K = 500K - 212K)
4. 426K must wait
b. Best-fit:
1. 212K is put in 300K partition
2. 417K is put in 500K partition
3. 112K is put in 200K partition
4. 426K is put in 600K partition
c. Worst-fit:
1. 212K is put in 600K partition
2. 417K is put in 500K partition
3. 112K is put in 388K partition
4. 426K must wait
In this example, best-fit turns out to be the best.
NOTE: First-fit and best-fit are better than worst-fit in terms of speed and storage utilization
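A minimal C sketch of the three placement strategies, replaying the exercise above (partition and process sizes are from the exercise; a real allocator would keep holes in a linked list):

#include <stdio.h>

/* Free-partition sizes (in KB) from the exercise above. */
static int part[] = {100, 500, 200, 300, 600};
static const int N = 5;

/* mode: 0 = first fit, 1 = best fit, 2 = worst fit.
   Returns the index of the chosen partition, or -1 if the request must wait. */
int place(int req, int mode) {
    int best = -1;
    for (int i = 0; i < N; i++) {
        if (part[i] < req) continue;
        if (mode == 0) { best = i; break; }
        if (best == -1 ||
            (mode == 1 && part[i] < part[best]) ||
            (mode == 2 && part[i] > part[best]))
            best = i;
    }
    if (best != -1) part[best] -= req;   /* the leftover becomes a new hole */
    return best;
}

int main(void) {
    int procs[] = {212, 417, 112, 426};
    for (int i = 0; i < 4; i++) {
        int k = place(procs[i], 1);      /* 1 = best fit, as in part (b) */
        if (k < 0) printf("%dK must wait\n", procs[i]);
        else printf("%dK placed; partition %d now has %dK left\n",
                    procs[i], k, part[k]);
    }
    return 0;
}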
3.3 Fragmentation
Fragmentation is a phenomenon in which storage space is used
inefficiently, reducing capacity or performance and often both.
1. External Fragmentation - This takes place when enough total memory space
exists to satisfy a request, but it is not contiguous; i.e., storage is fragmented into a large
number of small holes scattered throughout the main memory.
2. Internal Fragmentation - Allocated memory may be slightly larger than
requested memory.
Example: hole = 184 bytes, process size = 182 bytes.
If we allocate exactly the requested block, we are left with a hole of 2 bytes. The overhead to
keep track of this hole is larger than the hole itself, so the usual approach is to allocate the
entire 184 bytes to the process; the unused 2 bytes are internal fragmentation.
Solutions
Compaction: Move all processes toward one end of memory and all holes toward the other
end, producing one large hole of available memory. This scheme is expensive, and it is
possible only if relocation is dynamic and done at execution time.
4. SEGMENTATION
Segmentation is a memory-management scheme that supports the programmer's view of memory:
a logical address space is a collection of segments, and a segment table maps these two-part
(segment, offset) addresses to physical addresses.
Each entry in the segment table has a segment base and a segment limit.
The segment base contains the starting physical address where the segment resides in
memory, and the segment limit specifies the length of the segment.
A logical address consists of two parts: a segment number s, and an offset into that segment d.
The segment number is used as an index into the segment table.
The offset d of the logical address must be between 0 and the segment limit.
If it is not, we trap to the operating system (logical addressing attempt beyond end of
segment).
When an offset is legal, it is added to the segment base to produce the address in physical
memory of the desired byte.
For example,
segment 2 is 400 bytes long and begins at location 4300. Thus, a reference to byte 53 of segment 2 is
mapped onto location 4300 + 53 = 4353. A reference to segment 3, byte 852, is mapped to 3200 (the
base of segment 3) + 852 = 4052. A reference to byte 1222 of segment 0 would result in a trap to the
operating system, as this segment is only 1,000 bytes long.
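A minimal C sketch of segment-table translation using the example above (the bases and limits not stated in the text, such as those of segments 0 and 1, are assumed for illustration):

#include <stdio.h>

struct seg { int base, limit; };

/* Segment table: segment 2 is 400 bytes at 4300 and segment 3 starts at 3200,
   as in the text; the other values are assumed. */
static struct seg table[] = {
    {1400, 1000},   /* segment 0: 1000 bytes long (base assumed) */
    {6300,  400},   /* segment 1 (assumed) */
    {4300,  400},   /* segment 2 */
    {3200, 1100},   /* segment 3 (limit assumed) */
};

/* Translate (s, d) to a physical address; -1 means a trap to the OS. */
int translate(int s, int d) {
    if (d < 0 || d >= table[s].limit) return -1;  /* offset beyond segment limit */
    return table[s].base + d;
}

int main(void) {
    printf("seg 2, byte 53   -> %d\n", translate(2, 53));    /* 4353 */
    printf("seg 3, byte 852  -> %d\n", translate(3, 852));   /* 4052 */
    printf("seg 0, byte 1222 -> %d\n", translate(0, 1222));  /* -1: trap */
    return 0;
}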
5. PAGING
5.1 Basic Method
The basic method for implementing paging involves breaking physical memory into fixed-sized
blocks called frames and breaking logical memory into blocks of the same size called pages.
Every address generated by the CPU is divided into a page number (p) and a page offset (d).
The page number is used as an index into a page table, which contains the base address of each
page in physical memory. This base address is combined with the page offset to define the
physical memory address that is sent to the memory unit.
Consider an example in which, in the logical address, n = 2 and m = 4 (a 16-byte logical
address space with 4-byte pages). Using a page size of 4 bytes and a physical memory of 32
bytes (8 frames), we show how the programmer's view of memory can be mapped into physical
memory. Logical address 0 is page 0, offset 0. Indexing into the page table, we find that page 0
is in frame 5.
Thus, logical address 0 maps to physical address 20 [= (5 × 4) + 0]. Logical address 3 (page 0,
offset 3) maps to physical address 23 [= (5 × 4) + 3]. Logical address 4 is page 1, offset 0;
according to the page table, page 1 is mapped to frame 6. Thus, logical address 4 maps to
physical address 24 [= (6 × 4) + 0]. Logical address 13 maps to physical address 9.
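A minimal C sketch of this translation (the frames for pages 2 and 3 are assumed from the standard version of this example):

#include <stdio.h>

#define PAGE_SIZE 4   /* 4-byte pages, as in the example */

/* Page table: page 0 -> frame 5 and page 1 -> frame 6 as in the text;
   page 2 -> frame 1 and page 3 -> frame 2 are assumed. */
static int page_table[] = {5, 6, 1, 2};

int translate(int logical) {
    int p = logical / PAGE_SIZE;           /* page number */
    int d = logical % PAGE_SIZE;           /* page offset */
    return page_table[p] * PAGE_SIZE + d;  /* frame base + offset */
}

int main(void) {
    int addrs[] = {0, 3, 4, 13};
    for (int i = 0; i < 4; i++)
        printf("logical %2d -> physical %2d\n", addrs[i], translate(addrs[i]));
    return 0;
}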
Since the operating system is managing physical memory, it must be aware of the allocation details
of physical memory: which frames are allocated, which frames are available, how many total frames
there are, and so on.
5.2 TLB (Translation Look-aside Buffer)
The TLB is an associative, high-speed memory; each entry consists of a key and a value.
When the associative memory is presented with an item, the item is compared with all
keys simultaneously.
If the item is found, the corresponding value field is returned.
The TLB contains only a few of the page-table entries.
When a logical address is generated by the CPU, its page number is presented to the TLB.
If the page number is not in the TLB (known as a TLB miss), a memory reference to the page
table must be made.
Depending on the CPU, this may be done automatically in hardware or via an interrupt to the
operating system.
If the page number is found, its frame number is immediately available and is used to access
memory.
Hit Ratio - The percentage of times that the page number of interest is found in the TLB is
called the hit ratio.
An 80-percent hit ratio, for example, means that we find the desired page number in the TLB 80
percent of the time. If it takes 100 nanoseconds to access memory, then a mapped-memory
access takes 100 nanoseconds when the page number is in the TLB.
If we fail to find the page number in the TLB, then we must first access memory for the page
table and frame number (100 nanoseconds) and then access the desired byte in memory
(100 nanoseconds), for a total of 200 nanoseconds.
effective access time = 0.80 × 100 + 0.20 × 200
= 120 nanoseconds
For a 99-percent hit ratio, which is much more realistic, we have:
effective access time = 0.99 × 100 + 0.01 × 200 = 101 nanoseconds
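The general formula is: effective access time = hit ratio × memory time + (1 − hit ratio) × 2 × memory time, ignoring the TLB lookup time as the text does. A minimal C sketch:

#include <stdio.h>

/* Hit: one memory access; miss: two (page table, then the byte itself). */
double eat(double hit_ratio, double mem_ns) {
    return hit_ratio * mem_ns + (1.0 - hit_ratio) * 2.0 * mem_ns;
}

int main(void) {
    printf("80%% hit ratio: %.0f ns\n", eat(0.80, 100.0));  /* 120 ns */
    printf("99%% hit ratio: %.0f ns\n", eat(0.99, 100.0));  /* 101 ns */
    return 0;
}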
5.3 Protection
Memory protection in a paged environment is accomplished by protection bits associated with each
frame.
One additional bit is generally attached to each entry in the page table: a valid-invalid bit.
When this bit is set to valid, the associated page is in the logical address space and is thus
a legal (or valid) page.
When the bit is set to invalid, the page is not in the logical address space. Illegal addresses
are trapped by use of the valid-invalid bit.
5.4 Shared Pages
An advantage of paging is the possibility of sharing common code.
6. STRUCTURE OF THE PAGE TABLE
The most common techniques for structuring the page table are hierarchical paging, hashed
page tables, and inverted page tables.
1. Hierarchical Paging
The page table itself becomes large for computers with a large logical address space (2^32 to 2^64).
Example:
Consider a system with a 32-bit logical address space. If the page size in such a system is 4
KB (2^12), then a page table may consist of up to 1 million entries (2^32 / 2^12 = 2^20).
The solution is to divide the page table into smaller pieces, e.g. a two-level paging algorithm,
in which the page number is split into two parts:
p1 - an index into the outer page table
p2 - the displacement within the page of the inner page table
Because address translation works from the outer page table inward, this scheme is also known
as a forward-mapped page table.
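A minimal C sketch of splitting a 32-bit logical address into p1, p2 and d under the 10/10/12 division described above (the sample address is arbitrary):

#include <stdio.h>

int main(void) {
    unsigned addr = 0x12345678u;
    unsigned p1 = (addr >> 22) & 0x3FFu;  /* index into the outer page table */
    unsigned p2 = (addr >> 12) & 0x3FFu;  /* index into the inner page table */
    unsigned d  = addr & 0xFFFu;          /* offset within the 4-KB page */
    printf("p1 = %u, p2 = %u, d = %u\n", p1, p2, d);
    return 0;
}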
2. Hashed Page Tables
A common approach for handling address spaces larger than 32 bits is to hash the virtual page
number into a hash table. Each entry in the hash table contains a linked list of elements that hash
to the same location; each element consists of the virtual page number, the value of the mapped
page frame, and a pointer to the next element. The virtual page number is compared with the
entries in the list; if a match is found, the corresponding page frame is used to form the desired
physical address.
3. Inverted Page Tables
An inverted page table has one entry for each real page (frame) of memory. Each entry consists
of the virtual address of the page stored in that real memory location, with information about the
process that owns the page. When a memory reference occurs, the table is searched for a matching
virtual page number; when a match is found at entry i, the frame number i together with the offset
forms the physical address.
7. INTEL 32- AND 64-BIT ARCHITECTURES
The Pentium CPU provides both pure segmentation and segmentation with paging. In the latter case,
the CPU generates a logical address (segment-offset pair), which the segmentation unit converts into
a linear address, which in turn is mapped to a physical frame by the paging unit.
IA-32 Segmentation
The Pentium architecture allows segments to be as large as 4 GB (32 bits of offset).
Processes can have as many as 16K segments, divided into two 8K groups:
8K private to that particular process, stored in the Local Descriptor Table, LDT.
8K shared among all processes, stored in the Global Descriptor Table, GDT.
Logical addresses are (selector, offset) pairs, where the selector is made up of 16 bits:
A 13-bit segment number (up to 8K segments)
A 1-bit flag for LDT vs. GDT
2 bits for protection codes
The descriptor tables contain 8-byte descriptors for each segment, including the base and limit
of the segment.
Linear addresses are generated by looking the selector up in the descriptor table and adding
the appropriate base address to the offset.
IA-32 Paging
Pentium paging normally uses a two-tier paging scheme, with the first 10 bits being a page number
for an outer page table ( a.k.a. page directory ), and the next 10 bits being a page number within one
of the 1024 inner page tables, leaving the remaining 12 bits as an offset into a 4K page.
A special bit in the page directory can indicate that this page is a 4MB page, in which case the
remaining 22 bits are all used as offset and the inner tier of page tables is not used.
The CR3 register points to the page directory for the current process.
If the inner page table is currently swapped out to disk, the page-directory entry will have its
"invalid bit" set, and the remaining 31 bits provide information on where to find the swapped-out
page table on the disk.
x86-64
Intel's initial entry into 64-bit architectures was the IA-64 (later named
Itanium) architecture, but it was not widely adopted.
Meanwhile, AMD began developing a 64-bit architecture known as x86-64 that was
based on extending the existing IA-32 instruction set.
x86-64 supported much larger logical and physical address spaces, as well as several
other architectural advances.
A 64-bit address space yields an astonishing 2^64 bytes of addressable
memory, a number greater than 16 quintillion (or 16 exabytes).
8. VIRTUAL MEMORY
o It is a technique that allows the execution of processes that may not be completely in main
memory.
Virtual memory is the separation of user logical memory from physical memory. This
separation allows an extremely large virtual memory to be provided for programmers
when only a smaller physical memory is available.
Only part of the program needs to be in memory for execution.
Logical address space can therefore be much larger than physical address space.
Need to allow pages to be swapped in and out.
o Advantages:
Allows programs to be larger than physical memory.
Separation of user logical memory from physical memory
Allows processes to easily share files & address space.
Allows for more efficient process creation.
o Virtual memory can be implemented using
Demand paging
Demand segmentation
9. DEMAND PAGING
9.1 Concept
The basic idea behind demand paging is that when a process is swapped in, its pages are not swapped
in all at once. Rather, they are swapped in only when the process needs them (on demand). A swapper
that deals with pages in this way is termed a lazy swapper.
Advantages
Less I/O needed
Less memory needed
Faster response
More users
The procedure for handling a page fault is as follows:
1. We check an internal table (usually kept with the process control block) for this process to
determine whether the reference was a valid or an invalid memory access.
2. If the reference was invalid, we terminate the process. If it was valid but we have not yet brought
in that page, we now page it in.
3. We find a free frame (by taking one from the free-frame list, for example).
4. We schedule a disk operation to read the desired page into the newly allocated frame.
5. When the disk read is complete, we modify the internal table kept with the process and the page
table to indicate that the page is now in memory.
6. We restart the instruction that was interrupted by the trap. The process can now access the page
as though it had always been in memory.
10. PAGE REPLACEMENT
FIFO Page Replacement
This is the simplest page-replacement algorithm. The operating system keeps
track of all pages in memory in a queue, with the oldest page at the front. When
a page needs to be replaced, the page at the front of the queue is selected for removal.
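A minimal C sketch that counts page faults under FIFO replacement (the reference string and frame count are made-up demo values):

#include <stdio.h>

/* Count page faults for a reference string using FIFO replacement. */
int fifo_faults(const int *refs, int n, int frames) {
    int mem[16];                 /* resident pages (frames <= 16 for the demo) */
    int head = 0, used = 0, faults = 0;
    for (int i = 0; i < n; i++) {
        int hit = 0;
        for (int j = 0; j < used; j++)
            if (mem[j] == refs[i]) { hit = 1; break; }
        if (hit) continue;
        faults++;
        if (used < frames) mem[used++] = refs[i];   /* fill a free frame */
        else {                                      /* evict the oldest page */
            mem[head] = refs[i];
            head = (head + 1) % frames;
        }
    }
    return faults;
}

int main(void) {
    int refs[] = {7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2};
    printf("page faults = %d\n", fifo_faults(refs, 13, 3));
    return 0;
}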
LRU (Least Recently Used) Page Replacement: replace the page that has not been used for the
longest period of time.
Implementation of LRU
1. Counter
The counter or clock is incremented for every memory reference.
Each time a page is referenced, copy the counter into the time-of-use field.
When a page needs to be replaced, replace the page with the smallest
counter value.
2. Stack
Keep a stack of page numbers
Whenever a page is referenced, remove the page from the stack and put it on
top of the stack.
When a page needs to be replaced, replace the page that is at the bottom
of the stack (the LRU page).
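A minimal C sketch of the counter implementation (method 1 above); the stack variant would replace the scan for the smallest counter with list operations:

#include <stdio.h>

#define FRAMES 3

static int  page_in[FRAMES];      /* page held in each frame (-1 = empty) */
static long time_of_use[FRAMES];  /* counter value at the last reference */
static long clock_ticks = 0;      /* incremented on every memory reference */

/* Reference a page; returns 1 on a page fault. */
int reference(int page) {
    clock_ticks++;
    int victim = 0;
    for (int i = 0; i < FRAMES; i++) {
        if (page_in[i] == page) {            /* hit: update time of use */
            time_of_use[i] = clock_ticks;
            return 0;
        }
        if (time_of_use[i] < time_of_use[victim])
            victim = i;                      /* smallest counter = LRU page */
    }
    page_in[victim] = page;                  /* fault: replace the LRU page */
    time_of_use[victim] = clock_ticks;
    return 1;
}

int main(void) {
    for (int i = 0; i < FRAMES; i++) page_in[i] = -1;
    int refs[] = {1, 2, 3, 1, 4, 2}, faults = 0;
    for (int i = 0; i < 6; i++) faults += reference(refs[i]);
    printf("page faults = %d\n", faults);
    return 0;
}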
LRU Approximation Algorithms
(i) Additional Reference Bits Algorithm
o Reference bit
With each page, associate a reference bit, initially set to 0.
When the page is referenced, the bit is set to 1.
o When a page needs to be replaced, replace a page whose reference bit is 0.
o The order of use is not known, but we know which pages were used and which were not
used.
o Additional ordering information can be gained by keeping an 8-bit history byte for each
page, shifting the reference bit into it at regular intervals:
o If the history byte is 00000000, the page has not been used for 8 time periods.
o If the history byte is 11111111, the page has been used at least once in each time period.
o If the history byte of page 1 is 11000100 and that of page 2 is 01110111, then page 2 is
the LRU page (interpreting the bytes as unsigned integers, the page with the lowest value
is the LRU page).
(ii) Second Chance Algorithm
o Basic algorithm is FIFO.
o When a page has been selected, check its reference bit:
If 0, proceed to replace the page.
If 1, give the page a second chance and move on to the next FIFO page.
When a page gets a second chance, its reference bit is cleared and its
arrival time is reset to the current time.
Hence a second chance page will not be replaced until all other
pages are replaced.
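A minimal C sketch of the algorithm in its usual circular-queue (clock) form; rather than literally resetting arrival times, the hand moves past pages whose bits it has just cleared, which has the same effect:

#include <stdio.h>

#define FRAMES 3

static int page_in[FRAMES];   /* -1 = empty frame */
static int ref_bit[FRAMES];
static int hand = 0;          /* next FIFO position to inspect */

/* Reference a page; returns 1 on a page fault. */
int reference(int page) {
    for (int i = 0; i < FRAMES; i++)
        if (page_in[i] == page) { ref_bit[i] = 1; return 0; }   /* hit */

    /* Fault: sweep the circular queue, giving second chances (clearing
       reference bits) until a page with bit 0 is found. */
    while (page_in[hand] != -1 && ref_bit[hand] == 1) {
        ref_bit[hand] = 0;                 /* second chance */
        hand = (hand + 1) % FRAMES;
    }
    page_in[hand] = page;                  /* replace victim (or fill frame) */
    ref_bit[hand] = 1;
    hand = (hand + 1) % FRAMES;
    return 1;
}

int main(void) {
    for (int i = 0; i < FRAMES; i++) page_in[i] = -1;
    int refs[] = {1, 2, 3, 1, 4, 5}, faults = 0;
    for (int i = 0; i < 6; i++) faults += reference(refs[i]);
    printf("page faults = %d\n", faults);
    return 0;
}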
(iii) Enhanced Second Chance Algorithm
o Consider both the reference bit and the modify bit as an ordered pair.
o There are four possible classes:
1. (0,0) - neither recently used nor modified - best page to replace
2. (0,1) - not recently used but modified - the page has to be written out
before replacement
3. (1,0) - recently used but not modified - the page may be used again
4. (1,1) - recently used and modified - the page may be used again, and it
has to be written to disk before replacement
11. ALLOCATION OF FRAMES
o Proportional allocation - allocate frames according to the size of each process. If si is the
size of process pi and m is the total number of available frames, then S = Σ si and process
pi is allocated ai = si / S × m frames.
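A minimal C sketch of proportional allocation, using the common demo values s1 = 10, s2 = 127 and m = 62 (which yield a1 = 4 and a2 = 57):

#include <stdio.h>

int main(void) {
    int s[] = {10, 127};   /* process sizes, in pages (demo values) */
    int m = 62;            /* total available frames */
    int S = 0;
    for (int i = 0; i < 2; i++) S += s[i];
    for (int i = 0; i < 2; i++)
        printf("a%d = %d frames\n", i + 1, s[i] * m / S);  /* ai = si/S * m */
    return 0;
}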
o Global replacement - each process selects a replacement frame from the set of
all frames; one process can take a frame from another.
o Local replacement - each process selects from only its own set of allocated frames.
12. THRASHING
o If global replacement is used then as processes enter the main memory they tend to steal
frames belonging to other processes.
o Eventually processes do not have enough frames, and hence the page-fault rate becomes
very high.
o The system then spends its time swapping pages in and out rather than executing.
o This is the cause of thrashing.
1. Working-Set Strategy
The working-set model is based on the assumption of locality. It uses a parameter Δ to define
the working-set window: the set of pages in the most recent Δ page references is the working set.
If a page is in active use, it will be in the working set. If it is no longer being used, it will drop
from the working set Δ time units after its last reference.
The accuracy of the working set depends on the selection of Δ. If Δ is too small, it will not
encompass the entire locality; if Δ is too large, it may overlap several localities. In the extreme,
if Δ is infinite, the working set is the set of pages touched during the entire process execution.
The most important property of the working set, then, is its size.
If we compute the working-set size, WSSi, for each process in the system, we can then
consider that D = Σ WSSi, where D is the total demand for frames. Each process is actively
using the pages in its working set.
If the total demand is greater than the total number of available frames (D > m), thrashing will
occur, because some processes will not have enough frames.
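A minimal C sketch that computes the working-set size at each point of a demo reference string, for an assumed window Δ = 4:

#include <stdio.h>

#define DELTA 4   /* working-set window (demo value) */

/* Working-set size at time t: distinct pages in the last DELTA references. */
int wss(const int *refs, int t) {
    int set[DELTA], n = 0;
    int start = (t - DELTA + 1 < 0) ? 0 : t - DELTA + 1;
    for (int i = start; i <= t; i++) {
        int seen = 0;
        for (int j = 0; j < n; j++)
            if (set[j] == refs[i]) { seen = 1; break; }
        if (!seen) set[n++] = refs[i];
    }
    return n;
}

int main(void) {
    int refs[] = {1, 2, 1, 3, 2, 2, 2, 1, 4, 4};
    for (int t = 0; t < 10; t++)
        printf("t = %d: WSS = %d\n", t, wss(refs, t));
    return 0;
}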
2. Page-Fault Frequency Strategy
Thrashing has a high page-fault rate. Thus, we want to control the page-fault rate.
When it is too high, we know that the process needs more frames. Conversely, if the page-fault rate is too
low, then the process may have too many frames.
We can establish upper and lower bounds on the desired page-fault rate.
If the actual page-fault rate exceeds the upper limit, we allocate the process another frame.
If the page-fault rate falls below the lower limit, we remove a frame from the process.
Thus, we can directly measure and control the page-fault rate to prevent thrashing.
13. ALLOCATING KERNEL MEMORY
1. Buddy System
The buddy system allocates memory from a fixed-size segment consisting of physically
contiguous pages. Memory is allocated from this segment using a power-of-2 allocator, which
satisfies requests in units sized as a power of 2 (4 KB, 8 KB, 16 KB, and so forth). A request that
is not appropriately sized is rounded up to the next highest power of 2. For example, a request for
11 KB is satisfied with a 16-KB segment.
For example, assume the size of a memory segment is initially 256 KB and the kernel requests
21 KB of memory. The segment is initially divided into two buddies, which we will call AL and
AR, each 128 KB in size. One of these buddies is further divided into two 64-KB buddies, BL and
BR; one of these is divided into two 32-KB buddies, CL and CR, one of which is used to satisfy
the 21-KB request.
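A minimal C sketch of the power-of-2 rounding that the buddy allocator applies to every request:

#include <stdio.h>

/* Round a request up to the next power of 2 (sizes in KB),
   e.g. 11 KB -> 16 KB and 21 KB -> 32 KB. */
unsigned next_pow2(unsigned n) {
    unsigned p = 1;
    while (p < n) p <<= 1;
    return p;
}

int main(void) {
    unsigned reqs[] = {11, 21, 64, 100};
    for (int i = 0; i < 4; i++)
        printf("%u KB -> %u KB\n", reqs[i], next_pow2(reqs[i]));
    return 0;
}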
2. Slab Allocation
A second strategy for allocating kernel memory is known as slab allocation. A slab is made up of
one or more physically contiguous pages. A cache consists of one or more slabs.
When a cache is created, a number of objects which are initially marked as free are allocated to the
cache. The number of objects in the cache depends on the size of the associated slab.
For example, a 12-KB slab (made up of three contiguous 4-KB pages) could store six 2-KB objects.
The slab allocator first attempts to satisfy the request with a free object in a partial slab. If none
exists, a free object is assigned from an empty slab. If no empty slabs are available, a new slab is
allocated from contiguous physical pages and assigned to a cache; memory for the object is
allocated from this slab.
14. SEGMENTATION WITH PAGING
o The IBM OS/2 32-bit version is an operating system running on top of the Intel 386 architecture.
The 386 uses segmentation with paging for memory management. The maximum number of
segments per process is 16 K, and each segment can be as large as 4 gigabytes.
o The local address space of a process is divided into two partitions.
The first partition consists of up to 8 K segments that are private to that process.
The second partition consists of up to 8 K segments that are shared
among all the processes.
o Information about the first partition is kept in the local descriptor table (LDT),
information about the second partition is kept in the global descriptor table (GDT).
o Each entry in the LDT and GDT consists of 8 bytes, with detailed information about a
particular segment, including the base location and length of the segment.
The logical address is a pair (selector, offset), where the selector is a 16-bit number:
s (segment number) | g (GDT/LDT flag) | p (protection)
      13 bits       |      1 bit       |    2 bits
The linear address is then divided into a two-part page number and an offset:
p1 (outer page table) | p2 (inner page table) | d (offset)
       10 bits        |       10 bits         |  12 bits
o To improve the efficiency of physical memory use, Intel 386 page tables can
be swapped to disk. In this case, an invalid bit is used in the page directory entry to indicate
whether the table to which the entry is pointing is in memory or on disk.
o If the table is on disk, the operating system can use the other 31 bits to specify the disk location
of the table; the table can then be brought into memory on demand.