Os Module 4 Notes
Chapter 9
VIRTUAL MEMORY MANAGEMENT
Virtual memory is a technique that allows the execution of processes that are not
completely in memory.
The main advantage of this scheme is that programs can be larger than physical memory.
Virtual memory also allows processes to share files easily and to implement shared
memory. It also provides an efficient mechanism for process creation.
But virtual memory is not easy to implement.
9.1 Background
An examination of real programs shows us that, in many cases, the entire program need
not be in physical memory in order to execute.
Even in cases where the entire program is needed, it may not all be needed at the
same time.
The ability to execute a program that is only partially in memory would confer many
benefits:
o A program would no longer be limited by the amount of physical memory that is available.
o More than one program could run at the same time, increasing throughput and CPU
utilization.
o Less I/O would be needed to swap or load user programs into memory, so each
user program would run faster.
Virtual memory involves the separation of the user’s logical memory from physical memory.
This separation allows an extremely large virtual memory to be provided for programmers
even when only a smaller physical memory is available, as shown in the below figure.
Figure 9.1 Diagram showing virtual memory that is larger than physical memory
The virtual address space of a process refers to the logical (or virtual) view of how a
process is stored in memory.
In the below figure, we allow the heap to grow upward in memory, as it is used for
dynamic memory allocation, and the stack to grow downward in memory through successive
function calls.
The large blank space (or hole) between the heap and the stack is part of the virtual
address space but will require actual physical pages only if the heap or stack grows.
Virtual address spaces that include holes are known as sparse address spaces.
Basic concepts
We need some form of hardware support to distinguish between the pages that are
in memory and the pages that are on the disk.
The valid-invalid bit scheme can provide this. If the bit is set to valid, the page is
both legal and in memory. If the bit is set to invalid, the page either is not in the
process’s logical address space or is valid but currently on the disk.
The page-table entry for a page that is brought into memory is set as valid but the
page-table entry for a page that is not currently in memory is either simply marked
invalid or contains the address of the page on disk as shown in below figure.
Access to the page which is marked as invalid causes a page fault trap.
Figure 9.5 Page table when some pages are not in main memory.
The steps for handling a page fault are straightforward and are shown in the below figure:
1. We check an internal table (usually kept with the process control block, PCB) for this
process to determine whether the reference was a valid or an invalid memory access.
2. If the reference was invalid, we terminate the process. If it was valid but the page has
not yet been brought in, we page it in.
3. We find a free frame.
4. We schedule a disk operation to read the desired page into the newly allocated frame.
5. When disk read is complete, we modify the internal table kept with the process to indicate
that the page is now in memory.
6. We restart the instruction which was interrupted by the trap. The process can now access
the page.
In the extreme case, we can start executing a process with no pages in memory. When
the OS sets the instruction pointer to the first instruction of the process, which is on
a page not in memory, the process immediately faults for the page. After the page is
brought into memory, the process continues to execute, faulting as necessary until every
page it needs is in memory. This scheme is known as pure demand paging: never bring a
page into memory until it is required.
The hardware support for demand paging is the same as that for paging and swapping:
o Page table: It has the ability to mark an entry invalid through valid-invalid bit.
o Secondary memory: This holds the pages that are not present in main memory. It
is a high speed disk. It is known as the swap device, and the section of disk used
for this purpose is known as swap space.
A crucial requirement for demand paging is the need to be able to restart any
instruction after a page fault.
A page fault may occur at any memory reference. If the page fault occurs on
instruction fetch, we can restart by fetching the instruction again.
If a page fault occurs while we are fetching an operand, we must fetch and decode the
instruction again and then fetch the operand.
As an example, consider a three-address instruction such as ADD, which adds the content
of A to B, placing the result in C. These are the steps to execute this instruction:
1. Fetch and decode the instruction (ADD).
2. Fetch A.
3. Fetch B.
4. Add A and B.
5. Store the sum in C.
If we fault when we try to store in C (because C is in a page not currently in memory),
we have to get the desired page, bring it into memory, correct the page table, and restart
the instruction.
The restart will require fetching the instruction again, decoding it again, fetching the
two operands again, and then adding again.
To compute the effective access time, we must know how much time is needed to
service a page fault.
With an average page-fault service time of 8 milliseconds and a memory-access time
of 200 nanoseconds, the effective access time in nanoseconds is
Effective Access Time = (1 − p) × 200 + p × (8 milliseconds)
= (1 − p) × 200 + p × 8,000,000
= 200 + 7,999,800 × p.
The effective access time is directly proportional to the page-fault rate.
If one access out of 1,000 causes a page fault, the effective access time is
200 + 7,999,800 × 0.001 = 8,199.8 ns ≈ 8.2 microseconds. The computer is slowed
down by a factor of 40 because of demand paging. If we want performance
degradation to be less than 10 percent, the effective access time must stay below
200 ns + 10% = 220 ns. So,
220 > 200 + 7,999,800 × p
20 > 7,999,800 × p
p < 20 ÷ 7,999,800 ≈ 0.0000025.
From this we can conclude that fewer than one memory access out of 399,990 may
fault if the slowdown due to demand paging is to stay below 10 percent.
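The effective-access-time arithmetic above can be checked with a short sketch, using the figures from the notes (200 ns memory access, 8 ms page-fault service time):

```python
# Effective access time (EAT) under demand paging.
MEM_NS = 200              # memory-access time in nanoseconds
FAULT_NS = 8_000_000      # 8 ms page-fault service time, in nanoseconds

def eat(p):
    """Effective access time in ns for page-fault rate p."""
    return (1 - p) * MEM_NS + p * FAULT_NS

print(eat(0))            # no faults: 200 ns
print(eat(1 / 1000))     # one fault per 1,000 accesses: 8199.8 ns (~8.2 us)

# Largest fault rate that keeps degradation under 10% (EAT < 220 ns):
p_max = 20 / (FAULT_NS - MEM_NS)
print(p_max)             # roughly 2.5e-06, i.e. < 1 fault per 399,990 accesses
```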
9.3 Copy-on-write
The copy-on-write technique allows the parent and child processes initially to share the
same pages. These pages are marked as copy-on-write pages, meaning that if either process
writes to a shared page, a copy of the shared page is created.
Copy-on-write is illustrated in below figures, which shows the contents of the physical
memory before and after process 1 modifies page C.
If the total memory requirement exceeds physical memory, a page-replacement policy deals
with replacing (removing) pages from memory to free frames for bringing in new pages.
While a user process is executing, a page fault occurs. The operating system determines
where the desired page resides on the disk but then finds that there are no free frames
on the free-frame list, as shown in the below figure.
If no frame is free, we find one that is not currently being used and free it. We can
free a frame by writing its contents to swap space and changing the page table to
indicate that the page is no longer in memory, as shown in the below figure. We can
now use the freed frame to hold the page for which the process faulted. The page-
fault service routine is modified as follows to include page replacement:
If no frames are free, two page transfers (one out and one in) are required. This
doubles the page-fault service time and increases the effective access time accordingly.
This overhead can be reduced by using modify (dirty) bit. Each page or frame may have
modify (dirty) bit associated with it. The modify bit for a page is set by the hardware
whenever any word or byte in the page is written into, indicating that the page has been
modified.
When we select the page for replacement, we check the modify bit. If the bit is set, then
the page is modified and we must write the page to the disk.
If the bit is not set then the page has not been modified. Therefore, we can avoid writing
the memory page to the disk as it is already there.
We must solve two major problems to implement demand paging i.e., we must develop
a frame allocation algorithm and a page replacement algorithm. If we have multiple
processes in memory, we must decide how many frames to allocate to each process and
when page replacement is needed. We must select the frames that are to be replaced.
There are many different page-replacement algorithms. We want the one with the
lowest page-fault rate.
An algorithm is evaluated by running it on a particular string of memory references called
a reference string and computing the number of page faults.
The first three references (7, 0, 1) cause page faults and are brought into the empty
frames.
The next reference (2) replaces page 7, because page 7 was brought in first.
Since 0 is the next reference and 0 is already in memory, there is no page fault for this
reference.
The next reference (3) replaces page 0, but the following reference to 0 then causes a
page fault, and page 1 is in turn replaced by page 0.
This continues to the end of the string, as shown in the below figure, giving 15 faults
altogether.
Figure 9.12 FIFO page-replacement algorithm.
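The walk-through above can be reproduced with a small simulator. The 20-entry reference string below is an assumption, chosen to match the standard example the figure appears to use; with three frames, FIFO incurs 15 faults:

```python
from collections import deque

def fifo_faults(refs, nframes):
    """Count page faults for FIFO replacement with nframes frames."""
    frames = deque()              # oldest page at the left
    faults = 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == nframes:
                frames.popleft()  # evict the page brought in first
            frames.append(page)
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(fifo_faults(refs, 3))  # 15
```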
For some page-replacement algorithms, the page-fault rate may increase as the number of
allocated frames increases. This is known as Belady’s anomaly. The FIFO replacement
algorithm may exhibit this problem.
To illustrate Belady’s anomaly with the FIFO page-replacement algorithm, consider
the following reference string.
1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
The below figure shows the curve of page faults for this reference string versus
the number of available frames. The number of faults for four frames (ten) is greater
than the number of faults for three frames (nine).
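The anomaly can be sketched with a self-contained FIFO fault counter over the reference string above; more frames yield more faults:

```python
from collections import deque

def fifo_faults(refs, nframes):
    """Page faults for FIFO replacement with nframes frames."""
    frames, faults = deque(), 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == nframes:
                frames.popleft()   # evict the oldest page
            frames.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))  # 9 faults with three frames
print(fifo_faults(refs, 4))  # 10 faults with four frames: Belady's anomaly
```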
Optimal page replacement replaces the page that will not be used for the longest period
of time. Applied to our example reference string:
The first three references cause faults that fill the three empty frames.
The reference to page 2 replaces page 7, because page 7 will not be used until
reference 18, whereas page 0 will be used at reference 5 and page 1 at reference 14.
This continues to the end of the string, as shown in the figure.
LRU replacement associates with each page the time of that page's last use.
When a page must be replaced LRU chooses the page that has not been used
for the longest period of time.
The result of applying LRU replacement to our example reference string is
shown in below figure.
Figure 9.15 LRU page-replacement algorithm
The first five faults are the same as those of optimal replacement.
When the reference to page 4 occurs, LRU sees that, of the three pages in memory,
page 2 was used least recently. The most recently used page is page 0, and just
before that, page 3 was used. Thus LRU replaces page 2.
The LRU policy is often used as a page-replacement algorithm and is considered to be
good.
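LRU can be sketched with an ordered structure; run on the same assumed 20-entry reference string as the FIFO example, three frames yield 12 faults (better than FIFO's 15):

```python
from collections import OrderedDict

def lru_faults(refs, nframes):
    """Page faults for LRU replacement with nframes frames."""
    frames = OrderedDict()    # least recently used page sits at the front
    faults = 0
    for page in refs:
        if page in frames:
            frames.move_to_end(page)        # mark as most recently used
        else:
            faults += 1
            if len(frames) == nframes:
                frames.popitem(last=False)  # evict the least recently used
            frames[page] = True
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(lru_faults(refs, 3))  # 12
```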
Two implementations are possible,
o Counters: We associate with each page-table entry a time-of-use field and
add to the CPU a logical clock or counter. The clock is incremented for each
memory reference. When a reference to a page is made, the contents of the clock
register are copied to the time-of-use field in the page table entry for that page.
In this way we have the time of last reference to each page and we replace the
page with smallest time value. The time must also be maintained when page
tables are changed.
Figure 9.16 Use of a stack to record the most recent page references
Additional-Reference-Bits Algorithm
Second-Chance Algorithm
Counting-based page replacement keeps a counter of the number of references that
have been made to each page. Two schemes result:
o The least frequently used (LFU) algorithm replaces the page with the smallest
count. The reason for this selection is that an actively used page should have a
large reference count. This algorithm suffers, however, when a page is used heavily
during the initial phase of a process but never used again: since it was used
heavily, it has a large count and remains in memory even though it is no
longer needed.
o The most frequently used (MFU) algorithm is based on the argument that the page
with the smallest count was probably just brought in and has yet to be used.
Page-Buffering Algorithms
Systems keep a pool of free frames and when a page fault occurs, a victim frame
is chosen as before. The desired page is read into a free frame from the pool
before the victim is written out. This procedure allows the process to restart as soon
as possible, without waiting for the victim page to be written out. When the victim
is later written out, its frame is added to the free-frame pool.
An expansion of this idea is to maintain a list of modified pages. Whenever the
paging device is idle, a modified page is selected and is written to the disk. Its
modify bit is then reset. This scheme increases the probability that a page will be
clean when it is selected for replacement and will not need to be written out.
Another modification is to keep a pool of free frames but to remember which page
was in each frame. Since the frame contents are not modified when a frame is
written to the disk, the old page can be reused directly from the free-frame pool if
it is needed before that frame is reused; no I/O is needed in this case. When a page
fault occurs, we first check whether the desired page is in the free-frame pool. If it
is not, we must select a free frame and read the desired page into it.
Applications and Page Replacement
In some cases, applications that access data through the operating system's virtual
memory perform worse than if the operating system provided no buffering at all. A
typical example is a database, which provides its own memory management and I/O
buffering. Applications like this understand their memory use and disk use better
than an operating system that implements algorithms for general-purpose use.
In another example, data warehouses frequently perform massive sequential disk
reads, followed by computations and writes. The LRU algorithm would be
removing old pages and preserving new ones, while the application would more
likely be reading older pages than newer ones. Here, MFU would be more efficient
than LRU.
Because of such problems, some operating systems give special programs the
ability to use a disk partition as a large sequential array of logical blocks, without
any file-system data structures. This array is sometimes called the raw disk, and
I/O to this array is termed raw I/O.
9.5 Allocation of Frames
It is concerned with how we allocate the fixed amount of free memory among the
various processes.
In the simplest case, consider a single-user system with 128 KB of memory composed
of 1-KB pages. This system has 128 frames. The operating system may take 35 KB,
leaving 93 frames for the user process. When the free-frame list is exhausted, a page-
replacement algorithm is used to select one of the 93 in-memory pages to be
replaced with the 94th, and so on. When the process terminates, the 93 frames are
once again placed on the free-frame list.
The strategies for the allocation of frames are constrained in various ways. We
cannot allocate more than the total number of available frames. We must also
allocate at least a minimum number of frames.
One reason for allocating at least a minimum number of frames involves
performance. As the number of frames allocated to each process decreases, the
page-fault rate increases, slowing process execution.
Whereas the minimum number of frames per process is defined by the architecture,
the maximum number is defined by the amount of available physical memory.
Allocation Algorithms
The easiest way to split m frames among n processes is to give everyone an equal
share i.e., m/n frames. For example, if there are 93 frames and five processes,
each process will get 18 frames. The three leftover frames can be used as a free-
frame buffer pool. This scheme is called equal allocation.
An alternative is to recognize that various processes will need differing amounts
of memory. Consider a system with a 1-KB frame size. If a small student process
of 10 KB and an interactive database of 127 KB are the only two processes running
in a system with 62 free frames, it does not make much sense to give
each process 31 frames. The student process does not need more than 10 frames,
so the other 21 are wasted.
To solve this problem, we can use proportional allocation in which we allocate
available memory to each process according to its size. Let the size of the virtual
memory for process pi be si , and define
S=∑ si
Then, if the total number of available frames is m, we allocate ai frames to
process pi, where ai is approximately
ai = si /S * m.
We must adjust each ai to be an integer that is greater than the minimum number of
frames required, with a sum not exceeding m.
With proportional allocation, we would split 62 frames between two processes,
one of 10 pages and one of 127 pages, by allocating 4 frames and 57 frames,
respectively, since
10/137 × 62 ≈ 4, and
127/137 × 62 ≈ 57.
In this way, both processes share the available frames according to their "needs"
rather than equally.
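The proportional-allocation arithmetic can be sketched as follows (floor division, matching the 10-page/127-page example; any frames lost to rounding are assumed to stay on the free-frame list):

```python
def proportional_allocation(sizes, m):
    """Allocate m frames to processes in proportion to their sizes.

    Uses floor division, so a few leftover frames may remain; here they
    are assumed to stay on the free-frame list.
    """
    S = sum(sizes)                      # total size of all processes
    return [s * m // S for s in sizes]  # a_i = s_i / S * m, rounded down

# 62 free frames split between a 10-page and a 127-page process:
print(proportional_allocation([10, 127], 62))  # [4, 57]; 1 frame left free
```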
We may want to give a high-priority process more memory than low-priority
processes to speed its execution. One solution is to use a proportional allocation
scheme in which the ratio of frames depends not on the relative sizes of processes
but rather on their priorities, or on a combination of size and priority.
Another important factor in the way frames are allocated to the various processes
is page replacement. With multiple processes competing for frames, we can classify
page-replacement algorithms into two broad categories: global and local
replacement. The differences between them are:
Global replacement:
o A process may select a replacement frame from the set of all frames, even if that
frame is currently allocated to some other process.
o The number of frames allocated to a process can change.
o Generally gives greater system throughput.
o A process cannot control its own page-fault rate.
o Commonly used.
Local replacement:
o Each process selects a replacement frame only from its own set of allocated frames.
o The number of frames allocated to a process does not change.
o Gives lower system throughput.
o A process can control its own page-fault rate.
o Rarely used.
9.6 Thrashing
Cause of Thrashing
10.1 File Concept
Computers can store information on various storage media.
The operating system provides a uniform logical view of information storage. This
logical storage unit is called a file.
Files are mapped by operating system onto physical devices. These storage devices
are nonvolatile, so contents are persistent through power failures and system reboots.
File is a named collection of related information that is recorded on secondary
storage.
Files represent both programs and data. Data can be numeric, alphanumeric,
alphabetic, or binary.
Many different types of information, such as source programs, object programs,
executable programs, numeric data, payroll records, graphic images, and sound
recordings, can be stored in a file.
A file has a certain defined structure according to its type.
o Text file: A text file is a sequence of characters organized into lines.
o Object file: An object file is a sequence of bytes organized into blocks
understandable by the system’s linker.
o Executable file: An executable file is a series of code sections that the loader
can bring into memory and execute.
o Source file: A source file is a sequence of subroutines and functions, each of
which is further organized as declarations followed by executable statements.
File Attributes
File attributes vary from one OS to another. The common file attributes are:
o Name: The symbolic file name is the only information kept in human
readable form.
o Identifier: The unique tag, usually a number, identifies the file within the file
system. It is the non-readable name for a file.
o Type: This information is needed for systems that support different types.
o Location: This information is a pointer to a device and to the location of the
file on that device.
o Size: The current size of the file and possibly the maximum allowed size are
included in this attribute.
o Protection: Access-control information determines who can do reading,
writing, executing, and so on.
o Time, date, and user identification: This information may be kept for
creation, last modification, and last use. These data are useful for protection,
security, and usage monitoring.
File Operations
A file is an abstract data type. To define a file properly, we need to consider the
operations that can be performed on files.
Basic operations of files are,
o Creating a file: Two steps are necessary to create a file. First, space in the file system
for file is found. Second, an entry for the new file must be made in the directory.
o Writing a file: To write to a file, we make a system call specifying both the name
of the file and the information to be written. Given the name, the system searches
the directory to find the file’s entry. The system must keep a write pointer to the
location in the file where the next write is to take place.
o Reading a file: To read from a file, we use a system call that specifies the name of
the file and the memory address where the next block of the file should be put.
Again, the directory is searched for the associated entry, and the system must
maintain a read pointer to the location in the file where the next read is to take place.
o Deleting a file: The system searches the directory for the named file. Having found
the associated directory entry, it releases all file space so that it can be reused by
other files.
o Truncating the file: User may want to erase the contents of the file but keep its
attributes. Rather than forcing the user to delete a file and then recreate it, truncation
allows all attributes to remain unchanged except for file length.
o Repositioning within a file: The directory is searched for appropriate entry and the
current file position is set to a given value. Repositioning within a file does not need
to involve actual I/O. This file operation is also known as seek.
In addition to these six basic operations, two other common operations are appending
new information to the end of an existing file and renaming an existing file.
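The basic operations above map directly onto ordinary system-call wrappers. A minimal sketch in Python (the file name `demo.txt` is invented for illustration):

```python
import os

path = "demo.txt"

# Create and write: space is allocated and a directory entry is made.
with open(path, "w") as f:
    f.write("hello, file system\n")   # the write pointer advances

# Read: a read pointer tracks where the next read takes place.
with open(path, "r") as f:
    print(f.read(5))                  # "hello"
    f.seek(0)                         # reposition (seek) within the file
    print(f.read())                   # the whole contents again

# Truncate: erase the contents but keep the file and its attributes.
with open(path, "r+") as f:
    f.truncate(0)

os.rename(path, "demo2.txt")          # rename the existing file
os.remove("demo2.txt")                # delete: entry and space released
```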
Most file operations involve searching the directory for the entry associated
with the named file. To avoid this constant searching, the OS keeps a small table,
called the open-file table, containing information about all open files. When a file
operation is requested, the file is specified via an index into this table, so no
searching is required.
Several pieces of information are associated with an open file:
o File pointer: On systems that do not include a file offset as part of the read and
write system calls, the system must track the last read-write location as a
current-file-position pointer. This pointer is unique to each process operating
on the file.
o File-open count: Because multiple processes may open the same file, the system
must wait for the last process to close a file before removing its open-file table
entry; otherwise the table could run out of space. The file-open count tracks the
number of opens and closes and reaches zero on the last close, at which point
the entry can be reused.
o Disk location of the file: The information needed to locate the file on the disk
is kept in memory to avoid having to read it from the disk for each operation.
o Access rights: Each process opens a file in an access mode. This information
is stored in the per-process table, so the OS can allow or deny subsequent I/O
requests.
Some operating systems provide facilities for locking an open file (or sections of
a file). File locks allow one process to lock a file and prevent other processes from
gaining access to it.
File locks are useful for files that are shared by several processes.
A shared lock is similar to a reader lock in that several processes can acquire the
lock concurrently. An exclusive lock behaves like a writer lock which means only
one process at a time can acquire such a lock.
Some operating systems provide both types of locks, while other systems provide
only exclusive file locking.
Operating systems may also provide either mandatory or advisory file-locking
mechanisms. If a lock is mandatory, then once a process acquires an exclusive lock,
the operating system will prevent any other process from accessing the locked file.
For advisory locking, it is up to software developers to ensure that locks are
appropriately acquired and released.
Windows operating systems adopt mandatory locking, and UNIX systems employ
advisory locks.
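On UNIX systems, advisory locks of both kinds can be taken with `flock`; a minimal sketch (the scratch file is created only for the demonstration, and nothing stops an ill-behaved process from ignoring these locks, which is what "advisory" means):

```python
import fcntl
import os
import tempfile

fd, path = tempfile.mkstemp()        # scratch file for the demo
f = os.fdopen(fd, "w")

fcntl.flock(f, fcntl.LOCK_EX)        # exclusive (writer) lock
# ... critical section: only one process may hold LOCK_EX at a time ...
fcntl.flock(f, fcntl.LOCK_UN)        # release

fcntl.flock(f, fcntl.LOCK_SH)        # shared (reader) lock: other processes
# may also acquire LOCK_SH concurrently, but not LOCK_EX.
fcntl.flock(f, fcntl.LOCK_UN)

f.close()
os.remove(path)
print("lock/unlock sequence completed")
```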
File Types
The file type is often included as part of the file name. The name is split into two
parts: a name and an extension, separated by a period character.
The system uses the extension to indicate the type of the file and the type of
operations that can be done on that file.
For example, only a file with a .com, .exe, or .bat extension can be executed. The
.com and .exe files are two forms of binary executable files, whereas .bat file is a
batch file containing commands in ASCII format to the operating system.
The table shown in below figure gives the file type with extension and function.
Figure 10.2 Common file types.
File Structure
File types can be used to indicate the internal structure of the file. Certain files must
match to a required structure that is understood by the operating system.
For example, the operating system requires that an executable file have a specific
structure so that it can determine where in memory to load the file and what the
location of the first instruction is.
Some operating systems extend this idea into a set of system-supported file
structures, with sets of special operations for manipulating files with those
structures. The disadvantage is that the resulting operating system becomes
cumbersome.
Some operating systems impose and support a minimal number of file structures.
Too few structures make programming inconvenient, whereas too many cause the
operating system to bloat and confuse programmers.
Internal File Structure
Locating an offset within a file can be complicated for the operating system.
Disk systems have a well-defined block size determined by the size of a sector.
All disk I/O is performed in units of one block, and all blocks are the same size. It
is unlikely that the physical record size will exactly match the length of the desired
logical record. Logical records may even vary in length.
Packing a number of logical records into physical blocks is a common solution to
this problem. The logical record size, physical block size, and packing technique
determine how many logical records are in each physical block. The packing can
be done either by the user's application program or by the operating system. In
either case, the file may be considered a sequence of blocks.
All the basic I/O functions operate in terms of blocks. The conversion from logical
records to physical blocks is a relatively simple software problem.
Because disk space is always allocated in blocks, some portion of the last block of
each file is wasted. The waste incurred to keep everything in units of blocks is
internal fragmentation.
All file systems suffer from internal fragmentation; the larger the block size, the
greater the internal fragmentation.
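The packing arithmetic can be sketched as follows; the block and record sizes here are invented for illustration:

```python
BLOCK = 512    # physical block (sector) size in bytes - assumed
RECORD = 120   # logical record size in bytes - assumed

records_per_block = BLOCK // RECORD                   # 4 records fit
waste_per_block = BLOCK - records_per_block * RECORD  # 32 bytes wasted

def blocks_needed(nrecords):
    """Blocks needed to pack nrecords logical records (ceiling division)."""
    return -(-nrecords // records_per_block)

# Internal fragmentation comes from the unusable tail of every block
# plus the partially filled last block of the file.
print(records_per_block, waste_per_block)  # 4 32
print(blocks_needed(1000))                 # 250
```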
10.2 Access Methods
The information in the file can be accessed in several ways. The different file access
methods are,
Sequential Access
Sequential access is the simplest access method. Information in the file is processed
in order, one record after another. Editors and compilers access the files in this
fashion.
A read operation, read next, reads the next portion of the file and automatically
advances a file pointer, which tracks the next I/O location. Similarly, the write
operation, write next, appends to the end of the file and advances the pointer to
the end of the newly written material.
Such a file can be reset to the beginning, and on some systems a program may be
able to skip forward or backward n records for some integer n (perhaps only for n = 1).
Sequential access, which is depicted in the below figure, is based on a tape model of
a file and works as well on sequential-access devices as it does on random-access ones.
Direct Access
Under direct access, a file is made up of fixed-length logical records that allow
programs to read and write records rapidly in no particular order. Direct access
allows random access to any file block and is based on a disk model of a file,
since disks allow random access to any file block.
For direct access, the file is viewed as a numbered sequence of blocks or records.
Thus, we may read block 14, then read block 53, and then write block 7.
The direct access method is suitable for searching the records in large amount of
information. For example, on an airline-reservation system, we might store all
the information about a particular flight (for example, flight 713) in the block
identified by the flight number. Thus, the number of available seats for flight 713
is stored in block 713 of the reservation file.
The file operations must be modified to include the block number as a parameter.
Thus, we have read n, where n is the block number, rather than read next, and write
n rather than write next.
The block number provided by the user to the operating system is normally a
relative block number, an index relative to the beginning of the file.
We can easily simulate sequential access on a direct-access file by simply keeping
a variable cp that defines our current position, as shown in the below figure,
whereas simulating a direct-access file on a sequential-access file is extremely
inefficient and clumsy.
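The cp-variable simulation can be sketched over a direct-access `read_block(n)` interface; the in-memory list here is just an invented stand-in for a direct-access disk file:

```python
blocks = [f"block-{i}" for i in range(10)]  # stand-in for a direct-access file

def read_block(n):
    """Direct access: read any block by its (relative) block number."""
    return blocks[n]

# Sequential access simulated on top of direct access with a
# current-position variable cp:
cp = 0
def read_next():
    global cp
    data = read_block(cp)  # read the block at the current position...
    cp += 1                # ...then advance cp, as a tape would
    return data

print(read_next())    # block-0
print(read_next())    # block-1
print(read_block(7))  # direct access: jump straight to block 7
```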
Other Access Methods
Other access methods can be built on top of direct access. These methods generally
involve the construction of an index for the file. The index, like an index at the end
of a book, contains pointers to the various blocks.
To find a record in a file, we search the index and then use the pointer to access the file
directly and to find the desired record.
With large files, the index file itself may become too large to keep in memory. One
solution is to create an index for the index file: the primary index file contains
pointers to secondary index files, which in turn point to the actual data items.
For example, IBM's indexed sequential-access method (ISAM) uses a small
master index that points to disk blocks of a secondary index. The secondary index
blocks point to the actual file blocks. The file is kept sorted on a defined key. To
find a particular item, we first make a binary search of the master index, which
provides the block number of the secondary index. This block is read in, and again
a binary search is used to find the block containing the desired record. Finally, this
block is searched sequentially. In this way, any record can be located from its key
by at most two direct-access reads. The below figure shows a similar situation as
implemented by VMS index and relative files.
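A two-level index lookup in the ISAM style can be sketched with binary search; the keys and block contents below are invented for illustration (two index probes plus a scan of one data block locate any key):

```python
from bisect import bisect_right

# Sorted data blocks, each a sorted run of keys - invented example data.
data_blocks = [[2, 5, 9], [12, 15, 21], [30, 34, 40], [51, 60, 77]]

# Secondary index: the highest key in each data block.
secondary = [b[-1] for b in data_blocks]   # [9, 21, 40, 77]

# Master index: the highest key covered by each secondary-index block
# (here, one secondary block of two entries per master entry).
master = [secondary[1], secondary[3]]      # [21, 77]

def lookup(key):
    """Locate key with two binary searches plus a sequential block scan."""
    s_block = bisect_right(master, key - 1)           # search master index
    lo = s_block * 2                                  # its secondary block
    d_block = lo + bisect_right(secondary[lo:lo + 2], key - 1)
    return key in data_blocks[d_block]                # scan the data block

print(lookup(15), lookup(16))  # True False
```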
10.3 Directory Structure
File systems can be very large; some systems store millions of files on disk. To
manage all these data, we need to organize them, and this organization involves
the use of directories.
Storage structure
A disk can be used in its entirety for a file system. Sometimes, though, it is desirable
to place multiple file systems on a disk or to use only parts of a disk for a file
system. These parts are known as partitions, slices, or minidisks.
These parts can be combined to form larger structures known as volumes.
Each volume contains information about files within it. This information is kept in
entries in a device directory (directory) or volume table of contents.
The directory records the information such as name, location, size, type for all files
on that volume.
The below figure shows a typical file-system organization.
Directory Overview
The directory can be viewed as a symbol table that translates the file names into
their directory entries. The directory itself can be organized in many ways.
When considering a particular directory structure, several operations can be
performed on a directory.
o Search for a file: We need to be able to search the directory structure to find the
entry for a particular file. Since files have symbolic names, and similar names may
indicate a relationship between files, we should also be able to find all files whose
names match a particular pattern.
o Create a file: New files can be created and added to the directory.
o Delete a file: when a file is no longer needed, we can remove it from the
directory.
o List a directory: We need to be able to list the files in a directory and the contents
of the directory entry for each file in the list.
o Rename a file: The name of a file must be changeable when the contents or use of
the file change. Renaming also allows the file's position within the directory
structure to be changed.
o Traverse the file system: It should be possible to access every file in the file
system. It is also good to keep a backup copy of the files so that they can be
restored if the system fails or if the files are lost.
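The operations above can be sketched by viewing the directory as a symbol table that maps file names to entries, as described below. This Python sketch is illustrative only; the class name, entry fields, and file names are invented, not taken from any real OS.

```python
import fnmatch

# A directory viewed as a symbol table: file name -> directory entry.
class Directory:
    def __init__(self):
        self.entries = {}                 # name -> attributes

    def create(self, name, **attrs):
        if name in self.entries:          # names must be unique
            raise FileExistsError(name)
        self.entries[name] = attrs

    def delete(self, name):
        del self.entries[name]

    def search(self, pattern):
        # find all files whose name matches a particular pattern
        return [n for n in self.entries if fnmatch.fnmatch(n, pattern)]

    def rename(self, old, new):
        self.entries[new] = self.entries.pop(old)

    def list(self):
        return sorted(self.entries)

d = Directory()
d.create("notes.txt", size=120)
d.create("notes.bak", size=120)
d.rename("notes.bak", "old.bak")
print(d.search("notes.*"))                # ['notes.txt']
```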
There are different types of logical structures of a directory as discussed below.
Single-level directory
This is the simplest directory structure. All the files are contained in the same
directory which is easy to support and understand as shown in below figure
Figure 10.7 Single-level directory
The disadvantages are,
o Not suitable for a large number of files or for more than one user.
o Because all files are in a single directory, each file must have a unique name.
o Remembering the names of all the files becomes difficult as their number increases.
Two-level directory
A single-level directory often leads to confusion of file names between different
users. The solution is to create a separate directory for each user.
In a two-level directory, each user has his or her own directory, called a User File
Directory (UFD). Each UFD has a similar structure, but lists only the files of a
single user.
When a user job starts or a user logs in, the system's Master File Directory (MFD)
is searched. The MFD is indexed by user name or account number, and each
entry points to the UFD for that user, as shown in below figure
Figure 10.8 Two-level directory structure.
When a user refers to a particular file, only his own UFD is searched. Thus different
users may have files with the same name.
To create a file for a user, the OS searches only that user's UFD to check whether
another file of that name exists.
To delete a file, the OS checks only the local UFD, so it cannot accidentally delete
another user's file with the same name.
Although the two-level directory solves the name-collision problem, it still has
some disadvantages.
This structure isolates one user from another. Isolation is an advantage when the
users are completely independent, but a disadvantage when the users want to
cooperate on some task and to access one another's files.
If access is to be permitted, one user must have the ability to name a file in another
user's directory. To name a particular file we must give both the user name and the
file name.
A two-level directory is like a tree, or an inverted tree of height 2.
The root of the tree is the MFD. Its direct descendants are the UFDs. The
descendants of the UFDs are the files. The files are the leaves of the tree. A user
name and a file name define a path name.
To access the system files, the appropriate commands are given to the operating
system, and these files are read by the loader and executed. The file name is
searched for in the current UFD. The sequence of directories searched when a file
is named is called the search path.
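The MFD/UFD lookup described above can be sketched as follows, assuming the MFD is a table indexed by user name and each UFD maps file names to locations. All user names and block numbers below are made up for illustration.

```python
# Hypothetical two-level directory: MFD maps user -> UFD;
# each UFD maps file name -> entry (here, a block range string).
mfd = {
    "alice": {"prog.c": "blocks 4-7"},
    "bob":   {"prog.c": "blocks 9-12"},   # same name, different user: no clash
}

def lookup(user, fname):
    ufd = mfd[user]        # search the MFD for the user's UFD
    return ufd[fname]      # then search only that user's UFD

# A path name therefore combines user name and file name, e.g. /bob/prog.c
print(lookup("alice", "prog.c"))
print(lookup("bob", "prog.c"))
```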
Tree-structured directories
The two-level directory structure can be extended to a tree of arbitrary height, as
shown in below figure.
Acyclic-graph and general graph directories
A tree structure prohibits the sharing of files or directories; an acyclic graph
allows directories to share subdirectories and files.
The problem with using an acyclic-graph structure is ensuring that there are no
cycles. When we add links, the tree structure is destroyed, resulting in a simple
graph structure as shown in below figure.
The advantage of an acyclic graph is the simplicity of the algorithms to traverse
the graph and to determine when there are no more references to a file.
10.4 File System Mounting
The file system must be mounted before it can be made available to processes on the system.
The procedure for mounting the file is as follows:
o The OS is given the name of the device and the location within the file structure at which
to attach the file system (the mount point). A mount point is typically an empty directory
at which the mounted file system will be attached. For example, on UNIX, a file system
containing users' home directories might be mounted as /home; then, to access the
directory structure within that file system, we must precede the directory names with
/home, as in /home/Jane.
o The OS then verifies that the device contains a valid file system. It uses the device
driver for this verification.
o Finally the OS mounts the file system at the specified mount point.
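The steps above can be modelled with a toy mount table. The table layout, device name, and path-resolution rule (longest matching mount-point prefix) are simplifying assumptions for illustration, not a real OS interface.

```python
# Sketch of a mount table: mount point -> (device, file-system type).
mount_table = {}

def mount(device, fs_type, mount_point):
    # the OS would first verify, via the device driver, that `device`
    # really holds a valid file system of type `fs_type`
    mount_table[mount_point] = (device, fs_type)

def resolve(path):
    # find the longest mount point that is a prefix of the path
    best = max((mp for mp in mount_table if path.startswith(mp)),
               key=len, default=None)
    return best, mount_table.get(best)

mount("/device/dsk", "ufs", "/home")
print(resolve("/home/Jane/file.txt"))   # served by the mounted volume
print(resolve("/etc/passwd"))           # not under any mount point
```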
Consider the file system depicted in figure below, where the triangles represent
subtrees of directories. Figure (a) shows an existing file system, while figure (b) shows
an unmounted volume residing on /device/dsk. At this point, only the files on the
existing file system can be accessed.
Figure 10.12 File system, (a) Existing system, (b) Unmounted volume.
The below Figure shows the effects of mounting the volume residing on /device/dsk
over /users.
10.5 File Sharing
Multiple Users
Given a directory structure that allows files to be shared by users, the operating
system must mediate the file sharing.
The system can either allow a user to access the files of other users by default or
require that a user specifically grant access to the files.
To implement sharing and protection, the system must maintain more file and directory
attributes than are needed on a single-user system. Most systems support the concepts
of file owner and group.
When a user requests an operation on a file, the user ID can be compared with the
owner attribute to determine whether the requesting user is the owner of the file.
Likewise, the group IDs can be compared. The result indicates which permissions
are applicable.
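The owner/group comparison just described can be sketched like this; the attribute names and permission strings are illustrative assumptions, not a real system's format.

```python
# Choose which permission set applies, based on the requester's IDs.
def applicable_perms(file_attrs, uid, gids):
    if uid == file_attrs["owner"]:          # requester owns the file
        return file_attrs["perms"]["owner"]
    if file_attrs["group"] in gids:         # requester is in the file's group
        return file_attrs["perms"]["group"]
    return file_attrs["perms"]["other"]     # everyone else

f = {"owner": 100, "group": 20,
     "perms": {"owner": "rw", "group": "r", "other": ""}}
print(applicable_perms(f, 100, [20]))       # owner check wins: rw
print(applicable_perms(f, 101, [20]))       # group member: r
```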
Client-server Model
The system containing the files is the server, and the system requesting access
to the files is the client. Files are specified on a partition or subdirectory level.
A server can serve multiple clients and a client can use multiple servers.
A client can be specified by a network name or another identifier, such as an
IP address, but these can be spoofed (imitated). As a result of spoofing, an
unauthorized client could be allowed to access the server. More secure
solutions include secure authentication of the client by using encrypted
keys.
Distributed Information systems
Failure Modes
Local file systems can fail for a variety of reasons, including failure of the
disk containing the file system, corruption of the directory structure or other
disk-management information (metadata), disk-controller failure, cable
failure, and host-adapter failure.
User or system-administrator failure can also cause files to be lost or entire
directories or volumes to be deleted. Many of these failures will cause a host
to crash and an error condition to be displayed, and human intervention will
be required to repair the damage.
Remote file systems have even more failure modes. Because of the
complexity of network systems and the required interactions between
remote machines, many more problems can interfere with the proper
operation of remote file systems.
In the case of networks, the network can be interrupted between two hosts.
Such interruptions can result from hardware failure, poor hardware
configuration, or networking implementation issues.
To recover from failure, some kind of state information may be maintained
on both the client and the server.
The Network File System (NFS) takes a simple approach, implementing
a stateless Distributed File System (DFS). It assumes that a client request
for a file read or write would not have occurred unless the file system had
been remotely mounted and the file had been previously opened. The NFS
protocol carries all the information needed to locate the appropriate file and
perform the requested operation.
Consistency Semantics
UNIX Semantics
Session Semantics
The Andrew file system (AFS) uses the following consistency semantics,
o Writes to an open file by a user are not visible immediately to other users
that have the same file already opened.
o Once a file is closed, the changes made to it are visible only in sessions
starting later. Already opened instances of the file do not reflect these
changes.
According to these semantics, a file may be associated temporarily with
several images at the same time. Multiple users are allowed to perform both
read and write accesses concurrently on their images of the file, without
delay.
Immutable-Shared-Files Semantics
10.6 Protection
Information must be protected from physical damage and improper access, i.e.,
we need reliability and protection.
Protection can be provided in many ways. For a small single-user system, we might
provide protection by physically removing the external storage devices and locking
them in a desk drawer or file cabinet.
In a multiuser system other mechanisms are needed.
Types of Access
Chapter 11
FILE SYSTEM IMPLEMENTATION
11.1 File-System Structure
Disks provide the bulk of secondary storage on which the file system is maintained.
Disks have two characteristics that make them convenient for this purpose:
o They can be rewritten in place; it is possible to read a block from the disk,
modify the block, and write it back into the same place.
o Any given block of information on the disk can be accessed directly. Thus it is
simple to access a file either sequentially or randomly and to switch from one
file to another.
To provide efficient and convenient access to the disk, the OS imposes one or more
file systems to allow the data to be stored, located, and retrieved easily.
A file system poses two different design problems. The first problem is defining how
the file system should look to the user. This task involves defining a file and its attributes,
the operations allowed on a file and the directory structure for organizing files. The second
problem is creating algorithms and data structures to map the logical file system onto the
physical secondary-storage devices.
The file system itself is composed of many different levels. A layered design is
shown in below figure.
Each level in the design uses the features of lower levels to create new features for use by
higher levels.
The lowest level, the I/O control, consists of device drivers and interrupts handlers to
transfer information between the main memory and the disk system.
The basic file system issues generic commands to the appropriate device driver to read
and write physical blocks on the disk.
The file-organization module knows about files and their logical blocks, as well as
physical blocks. It also includes the free-space manager.
The logical file system manages metadata information. Metadata includes all of the
file-system structure except the actual data.
A File Control Block (FCB) contains information about the file, including ownership,
permissions, and the location of the file contents. Most operating systems support
more than one type of file system.
Figure 11.1 Layered file system.
11.2 File System Implementation
Overview
The file system is implemented on the disk and in memory.
The implementation of the file system varies according to the OS and the file system,
but there are some general principles.
UNIX supports the UNIX File System (UFS); Windows supports the File Allocation Table
(FAT), FAT-32, and the New Technology File System (NTFS); and Linux supports more
than 40 file systems.
If the file system is implemented on the disk it contains the following information:
o The Boot Control Block can contain information needed by the system to boot an
OS from that partition. If the disk has no OS, this block is empty. It is the first
block of the partition. In UFS, it is called the boot block; in NTFS, it is the
partition boot sector.
o The Partition Control Block contains volume or partition details such as the number
of blocks in the partition, the size of the blocks, the number of free blocks, the
free-block pointer, the free-FCB count, and FCB pointers. In NTFS, this is stored in
the master file table; in UFS, it is called the superblock.
o Directory structure is used to organize the files. In UFS, this includes file names
and associated inode numbers. In NTFS, it is stored in the master file table.
o An FCB contains many of the file's details, including file permissions,
ownership, size, and the location of the data blocks. In UFS, this is called the
inode; in NTFS, this information is stored within the master file table.
The in-memory information is used for both file-system management and
performance improvement via caching. The data are loaded at mount time and
discarded at dismount. The structures include:
o An in-memory mount table containing information about each mounted
volume.
o An in-memory directory structure that holds the directory information of
recently accessed directories.
o The system wide open file table contains a copy of the FCB of each open file
as well as other information.
o The per-process open file table contains a pointer to the appropriate entry in
the system wide open file table as well as other information.
To create a new file, an application program calls the logical file system. The logical
file system knows the format of the directory structures. To create a new file, it
allocates a new FCB. The system then reads the appropriate directory into memory,
updates it with the new file name and FCB, and writes it back to the disk. A typical
FCB is shown in below figure
A file must be opened before it can be used for I/O. The open() call passes a file
name to the file system. The open() system call first searches the system-wide
open-file table to see whether the file is already in use by another process. If it
is, a per-process open-file table entry is created pointing to the existing
system-wide entry. If it is not, the directory structure is searched for the given
file name.
The open() call returns a pointer to the appropriate entry in the per-process file-
system table. All file operations are performed via this pointer.
The name given to the entry varies. UNIX systems refer to it as a file descriptor,
Windows refers to it as a file handle.
When a process closes the file, the per-process table entry is removed, and the
system-wide entry's open count is decremented.
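The interplay of the two tables can be sketched with a toy model, assuming the system-wide table keeps an open count per file and each per-process table maps file descriptors to entries. All the structures here are invented for illustration.

```python
# System-wide table: one entry (FCB + open count) per open file.
system_wide = {}          # file name -> {"fcb": ..., "count": n}
# Per-process tables: each maps a file descriptor to a file name,
# standing in for a pointer into the system-wide table.
per_process = {}          # pid -> {fd: file name}

def open_file(pid, name):
    entry = system_wide.setdefault(name, {"fcb": object(), "count": 0})
    entry["count"] += 1                   # one more opener
    fds = per_process.setdefault(pid, {})
    fd = len(fds)                         # "file descriptor" handed back
    fds[fd] = name
    return fd

def close_file(pid, fd):
    name = per_process[pid].pop(fd)       # remove the per-process entry
    system_wide[name]["count"] -= 1       # decrement the open count
    if system_wide[name]["count"] == 0:
        del system_wide[name]             # last closer removes the entry

fd1 = open_file(1, "a.txt")
fd2 = open_file(2, "a.txt")               # second opener reuses the entry
close_file(1, fd1)
print(system_wide["a.txt"]["count"])      # 1
```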
The below figure illustrates the necessary file system structures provided by the
operating systems.
Figure 11.3 In-memory file-system structures, (a) File open, (b) File read.
A disk can be divided into multiple partitions. Each partition can be either raw,
i.e., containing no file system, or cooked, i.e., containing a file system.
Raw disk is used where no file system is appropriate. UNIX swap space can use a
raw partition, since it does not use a file system.
Some databases use raw disk and format the data to suit their needs. Raw disk
can also hold information needed by a disk RAID (Redundant Array of Independent
Disks) system.
Boot information can be stored in a separate partition with its own format, because
at boot time the system does not have the file-system device drivers loaded. Boot
information is a sequential series of blocks, loaded as an image into memory.
Dual booting is also possible on some PCs, where more than one OS is loaded on
a system.
A boot loader that understands multiple file systems and multiple operating
systems can occupy the boot space. Once loaded, it can boot one of the operating
systems available on the disk. The disk can have multiple partitions, each
containing a different type of file system and a different OS.
The boot partition contains the OS kernel and is mounted at boot time.
The operating system notes in its in-memory mount table that a file system is
mounted, along with the type of the file system. Microsoft Windows-based systems
mount each partition in a separate name space, denoted by a letter and a colon. On
UNIX, file systems can be mounted at any directory.
Virtual File Systems
The four main object types defined by the Linux VFS are:
o The inode object, which represents an individual file.
o The file object, which represents an open file.
o The superblock object, which represents an entire file system.
o The dentry object, which represents an individual directory entry.
11.3 Directory Implementation
A directory can be implemented in two ways, as discussed below.
Linear list
The linear list is the simplest method.
It uses a linear list of file names with pointers to the data blocks. It uses a linear
search to find a particular entry as shown in below figure.
It is simple for programming but time consuming to execute.
To create a new file, we must first search the directory to be sure that no existing
file has the same name. Then, we add a new entry at the end of the directory. To
delete a file, we search the directory for the named file and then release the space
allocated to it.
To reuse the directory entry, we can do one of several things. We can mark the
entry as unused or we can attach it to a list of free directory entries. A third
alternative is to copy the last entry in the directory into the freed location and to
decrease the length of the directory.
A linked list can also be used to decrease the time required to delete a file. The
linear search itself is the main disadvantage of this method.
An advantage of a sorted list is that a sorted directory listing can be produced
without a separate sort step.
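The linear-list scheme, including the delete trick of copying the last entry into the freed slot, can be sketched as follows; the entries are made up for illustration.

```python
# Linear-list directory: a list of (name, data-block pointer) pairs,
# searched sequentially.
directory = []

def create(name, block):
    for n, _ in directory:                 # linear search for a duplicate
        if n == name:
            raise FileExistsError(name)
    directory.append((name, block))        # add at the end

def delete(name):
    for i, (n, _) in enumerate(directory):
        if n == name:
            directory[i] = directory[-1]   # copy last entry into the hole
            directory.pop()                # and shrink the directory
            return
    raise FileNotFoundError(name)

create("a", 10); create("b", 20); create("c", 30)
delete("a")
print(directory)                           # [('c', 30), ('b', 20)]
```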
Hash table
11.4 Allocation Methods
Contiguous Allocation
A single contiguous set of blocks is allocated to a file at the time of file creation.
This is a pre-allocation strategy that uses variable-size portions.
The file allocation table needs only a single entry for each file, showing the
starting block and the length of the file.
The below figure shows the contiguous allocation method.
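Because each file occupies one contiguous run of blocks, the disk address of logical block i is simply start + i. A small sketch, with made-up table entries:

```python
# Contiguous allocation: the table keeps (start, length) per file.
alloc_table = {"count": (0, 2), "mail": (19, 6)}   # invented example files

def block_of(fname, i):
    start, length = alloc_table[fname]
    if not 0 <= i < length:
        raise IndexError("block outside file")
    return start + i          # one table lookup gives the disk address

print(block_of("mail", 3))    # 22
```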
Linked Allocation
It solves the problems of contiguous allocation. Allocation is done on the basis of
individual blocks: each block contains a pointer to the next block in the chain, and
the disk blocks can be scattered anywhere on the disk.
The directory contains a pointer to the first and the last blocks of the file.
The below figure shows the linked allocation. To create a new file, simply create
a new entry in the directory.
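A sketch of linked allocation follows; the chain of block numbers is invented. Note that direct access to logical block i costs i pointer-chasing reads, which is the scheme's main weakness for direct access.

```python
# Linked allocation: next_ptr[b] models the pointer stored inside
# block b; None marks the end of the chain.
next_ptr = {9: 16, 16: 1, 1: 10, 10: None}          # invented chain
directory = {"jeep": {"first": 9, "last": 10}}      # first and last blocks

def nth_block(fname, i):
    # reaching block i requires following i pointers from the first block
    b = directory[fname]["first"]
    for _ in range(i):
        b = next_ptr[b]
    return b

print([nth_block("jeep", i) for i in range(4)])     # [9, 16, 1, 10]
```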
Indexed Allocation
The file allocation table contains a separate one level index for each file. The
index has one entry for each portion allocated to the file.
The ith entry in the index block points to the ith block of the file. The below figure
shows indexed allocation.
Figure 11.8 Indexed allocation of disk space.
The indexes are not stored as a part of file allocation table but they are kept as a
separate block and the entry in the file allocation table points to that block.
Allocation can be made with either fixed-size or variable-size blocks. When the
file is created, all pointers in the index block are set to nil. When an entry is
made, a block is obtained from the free-space manager.
Indexed allocation supports both direct access and sequential access to the file. It
supports direct access without suffering from external fragmentation, because any
free block on the disk can satisfy a request for more space. Indexed allocation does,
however, suffer from wasted space.
The pointer overhead of the index block is generally greater than the pointer
overhead of linked allocation.
Every file must have an index block, so we want the index block to be as small as
possible. If the index block is too small it will not be able to hold enough pointers
for a large file, and a mechanism will have to be available to deal with this issue.
o Multilevel index
A variant of linked representation uses a first-level index block to point to
a set of second-level index blocks, which in turn point to the file blocks. To
access a block, the operating system uses the first-level index to find a
second-level index block and then uses that block to find the desired data
block. This approach could be continued to a third or fourth level,
depending on the desired maximum file size.
o Combined scheme
Another alternative, used in UFS, is to keep the first 15 pointers of the
index block in the file's inode. The first 12 of these pointers point to direct
blocks; that is, they contain the addresses of blocks that hold the file's data.
Thus, small files do not need a separate index block.
The next three pointers point to indirect blocks. The first points to a single
indirect block, which is an index block containing not data but the addresses
of blocks that do contain data. The second points to a double indirect block,
which contains the address of a block that contains the addresses of blocks
that contain pointers to the actual data blocks. The last pointer contains the
address of a triple indirect block.
In this method, the number of blocks that can be allocated to a file exceeds
the amount of space addressable by the four-byte file pointers used by many
operating systems.
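The capacity of the combined scheme can be worked out from the block and pointer sizes. The constants below (4 KB blocks, 4-byte pointers) are assumptions chosen for illustration, not fixed by the text.

```python
# Capacity of the UFS combined scheme: 12 direct pointers plus one
# single, one double, and one triple indirect block.
BLOCK = 4096                      # assumed block size (bytes)
PTRS = BLOCK // 4                 # 4-byte pointers -> 1024 per index block

direct = 12 * BLOCK               # 12 direct pointers
single = PTRS * BLOCK             # one single indirect block
double = PTRS ** 2 * BLOCK        # one double indirect block
triple = PTRS ** 3 * BLOCK        # one triple indirect block

total = direct + single + double + triple
print(total // 2**40, "TiB (approx.)")   # 4
```

With these assumptions the maximum file size is about 4 TiB, far beyond what a 4-byte (32-bit) file pointer can address.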
Performance
The allocation methods vary in their storage efficiency and data-block access times.
Both are important criteria in selecting the proper methods for an operating system
to implement.
For any type of access, contiguous allocation requires only one access to get a disk
block. Since we can easily keep the initial address of the file in memory, we can
calculate immediately the disk address of the ith block and read it directly.
For linked allocation, we can also keep the address of the next block in memory
and read it directly. This method is fine for sequential access, but for direct
access, an access to the ith block might require i disk reads.
The operating system must have appropriate data structures and algorithms to
support both allocation methods.
Files can be converted from one type to another by the creation of a new file of
the desired type, into which the contents of the old file are copied. The old file may
then be deleted and the new file renamed.
Indexed allocation is more complex. If the index block is already in memory, then
the access can be made directly. But keeping the index block in memory requires
considerable space. If this memory space is not available, then we may have to read
first the index block and then the desired data block. For a two-level index, two
index-block reads might be necessary. For an extremely large file, accessing a block
near the end of the file would require reading in all the index blocks before the
needed data block finally could be read. Thus, the performance of indexed
allocation depends on the index structure, on the size of the file, and on the
position of the block desired.
Some systems combine contiguous allocation with indexed allocation by using
contiguous allocation for small files and automatically switching to an indexed
allocation if the file grows large. Since most files are small, and contiguous
allocation is efficient for small files, average performance can be quite good.
11.5 Free-Space Management
Since disk space is limited, we need to reuse the space from deleted files for new files.
To keep track of free disk space, the system maintains a free-space list.
The free-space list records all free disk blocks-those not allocated to some file or
directory. To create a file, we search the free-space list for the required amount of
space and allocate that space to the new file. This space is then removed from the free-
space list. When a file is deleted, its disk space is added to the free-space list.
There are different methods to manage free space.
Bit Vector
Linked List
Another approach to free-space management is to link together all the free disk
blocks, keeping a pointer to the first free block in a special location on the disk and
caching it in memory.
This first block contains a pointer to the next free disk block, and so on.
This scheme is not efficient, because to traverse the list we must read each block,
which requires substantial I/O time.
The operating system simply needs a free block so that it can allocate that block
to a file, so the first block in the free list is used.
Grouping
A modification of the free-list approach stores the addresses of n free blocks in
the first free block. The first n-1 of these blocks are actually free. The last block
contains the addresses of another n free blocks, and so on.
The addresses of a large number of free blocks can be found quickly, unlike the
situation when the linked-list approach is used.
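A sketch of grouping with n = 3 follows; the disk layout is entirely invented. Each group block lists n-1 truly free blocks plus the number of the block holding the next group.

```python
# Grouping: each group block stores n addresses; the first n-1 are
# free blocks, the last points at the next group (None ends the chain).
n = 3
disk = {
    5:  [7, 8, 12],        # 7 and 8 are free; block 12 holds the next group
    12: [14, 20, None],    # 14 and 20 are free; no further group
}

def free_blocks(first_group):
    free = []
    group = first_group
    while group is not None:
        addrs = disk[group]
        free.append(group)         # the group block itself is also free
        free.extend(addrs[:-1])    # the first n-1 addresses are free
        group = addrs[-1]          # last address points at the next group
    return free

print(sorted(free_blocks(5)))      # [5, 7, 8, 12, 14, 20]
```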
Counting