OS Unit-4 (BCA)
File type        Usual extension      Function
Word processor   wp, tex, rtf, doc    Various word-processor formats
Archive          arc, zip, tar        Related files grouped into one compressed file
FILE DIRECTORIES:
A file directory is a collection of files. The directory contains information about the files,
including attributes, location and ownership. Much of this information, especially that concerned
with storage, is managed by the operating system. The directory is itself a file, accessible by
various file-management routines.
Information contained in a device directory includes:
Name
Type
Address
Current length
Maximum length
Date last accessed
Date last updated
Owner id
Protection information
Operations performed on a directory are:
Search for a file
Create a file
Delete a file
List a directory
Rename a file
Traverse the file system
Advantages of maintaining directories are:
Efficiency: A file can be located more quickly.
Naming: It becomes convenient for users, as two users can use the same name for different files,
or the same file can be referred to by different names.
Grouping: Logical grouping of files can be done by properties e.g. all java programs, all
games etc.
SINGLE-LEVEL DIRECTORY
In this scheme a single directory is maintained for all users.
Naming problem: no two files can have the same name.
Grouping problem: users cannot group files according to their needs.
TWO-LEVEL DIRECTORY
In this scheme a separate directory is maintained for each user.
Path name: Because of the two levels, every file has a path name used to locate it.
Now, different users can have files with the same name.
Searching is efficient in this method.
TREE-STRUCTURED DIRECTORY :
Directory is maintained in the form of a tree. Searching is efficient and also there is grouping
capability. We have absolute or relative path name for a file.
FILE ALLOCATION METHODS :
1. Contiguous Allocation –
A single contiguous set of blocks is allocated to a file at the time of file creation. Thus, this is a
pre-allocation strategy, using variable size portions. The file allocation table needs just a single
entry for each file, showing the starting block and the length of the file. This method is best from
the point of view of the individual sequential file. Multiple blocks can be read in at a time to
improve I/O performance for sequential processing. It is also easy to retrieve a single block. For
example, if a file starts at block b, and the ith block of the file is wanted, its location on
secondary storage is simply b+i-1.
Disadvantage –
External fragmentation will occur, making it difficult to find contiguous blocks of space of
sufficient length. A compaction algorithm will be necessary to free up additional space on the disk.
Also, with pre-allocation, it is necessary to declare the size of the file at the time of creation.
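The `b + i - 1` calculation above can be sketched as a tiny lookup over a file allocation table. This is an illustrative sketch, not a real file system; the file name and block numbers are invented, and block indices within a file are taken as 1-based, as in the text.

```python
# Sketch of contiguous allocation: the table keeps one (start, length)
# entry per file, as described above. Values are illustrative.
allocation_table = {
    "report.doc": {"start": 14, "length": 3},   # occupies blocks 14, 15, 16
}

def block_address(file_name, i):
    """Return the disk block holding the i-th block (1-based) of a file."""
    entry = allocation_table[file_name]
    if not 1 <= i <= entry["length"]:
        raise IndexError("block index out of range for this file")
    return entry["start"] + i - 1   # b + i - 1, as in the text

print(block_address("report.doc", 1))  # → 14
print(block_address("report.doc", 3))  # → 16
```

Direct access is a single addition, which is why contiguous allocation is the best layout for the individual sequential file.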
2. Linked Allocation(Non-contiguous allocation) –
Allocation is on an individual block basis. Each block contains a pointer to the next block in the
chain. Again the file table needs just a single entry for each file, showing the starting block and
the length of the file. Although pre-allocation is possible, it is more common simply to allocate
blocks as needed. Any free block can be added to the chain. The blocks need not be contiguous.
An increase in file size is always possible if a free disk block is available. There is no external
fragmentation, because only one block at a time is needed; internal fragmentation can occur, but
only in the last disk block of the file.
Disadvantage –
Internal fragmentation exists in the last disk block of the file.
There is an overhead of maintaining a pointer in every disk block.
If the pointer in any disk block is lost, the rest of the file is lost (the file is truncated).
It supports only sequential access to files.
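The chain-following behaviour described above can be sketched as follows. The block numbers, file name and sentinel value (-1 for end of chain) are all illustrative assumptions, not from any real implementation.

```python
# Sketch of linked (chained) allocation: each block stores a pointer to the
# next block of the file; -1 marks the end of the chain.
next_block = {9: 16, 16: 1, 1: -1}          # chain: 9 -> 16 -> 1
file_table = {"jeep.txt": {"start": 9, "length": 3}}

def read_blocks(file_name):
    """Follow the chain sequentially -- the only access order supported."""
    blocks = []
    b = file_table[file_name]["start"]
    while b != -1:
        blocks.append(b)
        b = next_block[b]                    # pointer overhead in every block
    return blocks

print(read_blocks("jeep.txt"))  # → [9, 16, 1]
```

Note that reaching the i-th block requires walking the whole chain up to it, which is exactly why linked allocation supports only sequential access.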
3. Indexed Allocation –
It addresses many of the problems of contiguous and chained allocation. In this case, the file
allocation table contains a separate one-level index for each file: The index has one entry for
each block allocated to the file. Allocation may be on the basis of fixed-size blocks or variable-
sized blocks. Allocation by blocks eliminates external fragmentation, whereas allocation by
variable-size blocks improves locality. This allocation technique supports both sequential and
direct access to the file and thus is the most popular form of file allocation.
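The per-file index described above can be sketched as a simple list of block numbers; the file name and blocks are illustrative assumptions.

```python
# Sketch of indexed allocation: each file has an index block listing all
# of its data blocks, so any block can be reached directly.
index_block = {"jeep.txt": [9, 16, 1, 10, 25]}

def block_address(file_name, i):
    """Direct access: the i-th block (1-based) is one index lookup away."""
    return index_block[file_name][i - 1]

# Direct access (no chain traversal needed):
print(block_address("jeep.txt", 4))  # → 10
# Sequential access is just as easy:
print(index_block["jeep.txt"])       # → [9, 16, 1, 10, 25]
```

One lookup gives both sequential and direct access, which is the property that makes this the most popular allocation form.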
Disk Free Space Management :
Just as the space that is allocated to files must be managed, so the space that is not currently
allocated to any file must be managed. To perform any of the file allocation techniques, it is
necessary to know which blocks on the disk are available. Thus we need a disk allocation table in
addition to a file allocation table. The following are the approaches used for free space
management.
1. Bit Tables : This method uses a vector containing one bit for each block on the disk. Each
entry for a 0 corresponds to a free block and each 1 corresponds to a block in use.
For example: 00011010111100110001
In this vector every bit corresponds to a particular block: 0 implies that the block is free and
1 implies that the block is already occupied. A bit table has the advantage that it is relatively
easy to find one free block or a contiguous group of free blocks. Thus, a bit table works well
with any of the file allocation methods. Another advantage is that it is as small as possible.
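Finding a single free block or a contiguous group in the bit vector above can be sketched with string searches. The sketch uses this section's convention (0 = free, 1 = in use) and assumes blocks are numbered from 0.

```python
# Sketch of scanning the example bit table, where 0 = free and 1 = in use.
bit_table = "00011010111100110001"

def first_free_block(bits):
    """Index of the first free (0) block, or -1 if the disk is full."""
    return bits.find("0")

def first_free_run(bits, k):
    """Start of the first run of k contiguous free blocks, or -1 if none."""
    return bits.find("0" * k)

print(first_free_block(bit_table))   # → 0 (block 0 is free)
print(first_free_run(bit_table, 3))  # → 0 (blocks 0-2 are free)
print(first_free_run(bit_table, 5))  # → -1 (no run of 5 free blocks)
```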
2. Free Block List : In this method, each block is assigned a number sequentially and the list of
the numbers of all free blocks is maintained in a reserved block of the disk.
A directory is a container used to organize files and folders in a hierarchical manner.
There are several logical structures of a directory, these are given below.
Single-level directory –
The single-level directory is the simplest directory structure. In it, all files are contained in
the same directory which makes it easy to support and understand.
A single-level directory has a significant limitation, however, when the number of files
increases or when the system has more than one user. Since all the files are in the same
directory, they must have unique names. If two users name their data file test, the
unique-name rule is violated.
Advantages:
Since there is only a single directory, its implementation is very easy.
If the number of files is small, searching is faster.
Operations like file creation, searching, deletion and updating are very easy in such a
directory structure.
Disadvantages:
Name collisions occur easily, since no two files may have the same name.
Searching becomes time-consuming if the directory is large.
Files of the same type cannot be grouped together.
Two-level directory –
As we have seen, a single-level directory often leads to confusion of file names among
different users. The solution to this problem is to create a separate directory for each user.
In the two-level directory structure, each user has their own user file directory (UFD). The
UFDs have similar structures, but each lists only the files of a single user. The system's master
file directory (MFD) is searched whenever a user logs in. The MFD is indexed by user name or
account number, and each entry points to the UFD for that user.
Advantages:
We can give full path like /User-name/directory-name/.
Different users can have the same directory as well as the file name.
Searching of files becomes easier due to pathname and user-grouping.
Disadvantages:
A user is not allowed to share files with other users.
Still, it is not very scalable; two files of the same type cannot be grouped together for the
same user.
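The MFD/UFD lookup described above can be sketched with nested dictionaries. The user names and file locations here are illustrative assumptions, not from any real system.

```python
# Sketch of a two-level directory: the MFD maps each user to a UFD, and
# each UFD maps file names to file locations.
mfd = {
    "alice": {"test": "blocks 4-7", "notes": "block 12"},   # alice's UFD
    "bob":   {"test": "block 20"},                          # bob's UFD
}

def lookup(user, file_name):
    """Resolve /user/file_name: find the user's UFD via the MFD, then
    search only that UFD for the file."""
    ufd = mfd.get(user)
    if ufd is None or file_name not in ufd:
        raise FileNotFoundError(f"/{user}/{file_name}")
    return ufd[file_name]

# Two users may use the same file name without collision:
print(lookup("alice", "test"))  # → blocks 4-7
print(lookup("bob", "test"))    # → block 20
```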
Tree-structured directory –
Once we have seen a two-level directory as a tree of height 2, the natural generalization is
to extend the directory structure to a tree of arbitrary height.
This generalization allows the user to create their own subdirectories and to organize their
files accordingly.
A tree structure is the most common directory structure. The tree has a root directory, and
every file in the system has a unique path.
Advantages:
Very general, since full path names can be given.
Very scalable; the probability of name collisions is low.
Searching becomes very easy; we can use both absolute and relative paths.
Disadvantages:
Not every file fits the hierarchical model; some files may need to be placed in multiple
directories.
Files cannot be shared.
It can be inefficient, because accessing a file may require traversing multiple directories.
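Absolute and relative path resolution in a tree-structured directory can be sketched with nested dictionaries; the directory layout below is an illustrative assumption.

```python
# Sketch of a tree-structured directory: directories are nested dicts,
# files are leaves marked with the string "file".
root = {
    "home": {
        "alice": {"prog.java": "file", "games": {"chess": "file"}},
    },
    "bin": {"ls": "file"},
}

def resolve(path, start=None):
    """Walk an absolute path (from the root) or a relative one (from start)."""
    node = root if path.startswith("/") else start
    for part in path.strip("/").split("/"):
        node = node[part]          # KeyError means the path does not exist
    return node

print(resolve("/home/alice/prog.java"))     # absolute path → file
cwd = resolve("/home/alice")                # current working directory
print(resolve("games/chess", start=cwd))    # relative path → file
```

Every file has exactly one path from the root, which is why sharing is not possible in a pure tree.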
Acyclic graph directory –
An acyclic graph is a graph with no cycle and allows us to share subdirectories and files.
The same file or subdirectories may be in two different directories. It is a natural
generalization of the tree-structured directory.
It is used in situations where, for example, two programmers are working on a joint project and
need to access each other's files. The associated files are stored in a subdirectory, separating
them from other projects and the files of other programmers. Since they are working on a joint
project, they want the subdirectory to appear in both of their own directories, so the common
subdirectory is shared. This is where acyclic-graph directories are used.
It is important to note that a shared file is not the same as a copy of the file. If any
programmer makes a change in the shared subdirectory, it is reflected in both directories.
Advantages:
We can share files.
Searching is easy, since a file can be reached through multiple paths.
Disadvantages:
Files are shared via links, so deletion can create problems:
If the link is a soft link, deleting the file leaves a dangling pointer.
In the case of hard links, to delete a file we have to delete all the references associated
with it.
General graph directory structure –
In the general graph directory structure, cycles are allowed: a directory can be reached from
more than one parent directory.
The main problem with this kind of directory structure is calculating the total size or space
taken by the files and directories.
Advantages:
It allows cycles.
It is more flexible than the other directory structures.
Disadvantages:
It is more costly than the other structures.
It needs garbage collection.
Disk Response Time: Response time is the average time a request spends waiting to perform its
I/O operation. Average response time is the mean response time of all requests. Variance of
response time is a measure of how individual requests are serviced with respect to the average
response time. A disk scheduling algorithm that gives a low variance of response time is
therefore better.
Disk Scheduling Algorithms
1. FCFS: FCFS is the simplest of all the disk scheduling algorithms. In FCFS, the requests are
addressed in the order they arrive in the disk queue. Let us understand this with the help of
an example.
Example:
Suppose the order of requests is (82,170,43,140,24,16,190) and the current position of the
read/write head is 50.
So, total seek time:
=(82-50)+(170-82)+(170-43)+(140-43)+(140-24)+(24-16)+(190-16)
=642
2. SSTF: In SSTF (Shortest Seek Time First), the request with the shortest seek time from the
current head position is executed first. The seek time of every request in the queue is
calculated in advance, and requests are scheduled accordingly, so the request nearest to the
disk arm gets executed first.
Example:
For the same queue (82,170,43,140,24,16,190) with the head at 50, the requests are serviced
in the order 43, 24, 16, 82, 140, 170, 190.
So, total seek time:
=(50-43)+(43-24)+(24-16)+(82-16)+(140-82)+(170-140)+(190-170)
=208
Advantages:
Average response time decreases
Throughput increases
Disadvantages:
Overhead to calculate seek time in advance
Can cause starvation for a request if it has a higher seek time as compared to incoming
requests
High variance of response time, as SSTF favours only some requests
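A seek total is just the sum of head movements over the order in which requests are serviced. The sketch below computes that sum for this queue: servicing in arrival order (FCFS) costs 642 cylinders of movement, while the shortest-seek-first order used in the 208 calculation costs 208.

```python
# Compute total seek distance for a given service order of disk requests.
requests = [82, 170, 43, 140, 24, 16, 190]
head = 50

def total_seek(order, start):
    """Sum of absolute head movements when servicing `order` from `start`."""
    total, pos = 0, start
    for r in order:
        total += abs(r - pos)
        pos = r
    return total

def sstf_order(pending, start):
    """Greedily pick the request nearest to the current head position."""
    pending, pos, order = list(pending), start, []
    while pending:
        nxt = min(pending, key=lambda r: abs(r - pos))
        pending.remove(nxt)
        order.append(nxt)
        pos = nxt
    return order

print(total_seek(requests, head))                    # FCFS order → 642
print(total_seek(sstf_order(requests, head), head))  # SSTF order → 208
```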
3. SCAN: In the SCAN algorithm the disk arm moves in a particular direction, servicing the
requests in its path; after reaching the end of the disk, it reverses direction and again
services the requests arriving in its path. This algorithm works like an elevator and is hence
also known as the elevator algorithm. As a result, requests at the midrange are serviced more,
and those arriving just behind the disk arm have to wait.
Example:
Suppose the requests to be addressed are 82,170,43,140,24,16,190. The read/write arm is at 50,
and it is given that the disk arm should move towards the larger value.
Therefore, the seek time is calculated as:
=(199-50)+(199-16)
=332
Advantages:
High throughput
Low variance of response time
Average response time
Disadvantages:
Long waiting time for requests for locations just visited by disk arm
4. C-SCAN: In the SCAN algorithm, the disk arm re-scans the path it has just scanned after
reversing its direction. So it may happen that too many requests are waiting at the other end,
or that there are few or no pending requests in the area just scanned.
These situations are avoided in the C-SCAN algorithm, in which the disk arm, instead of
reversing its direction, goes to the other end of the disk and starts servicing requests from
there. The disk arm thus moves in a circular fashion; since the algorithm is otherwise similar
to SCAN, it is known as C-SCAN (Circular SCAN).
Example:
Suppose the requests to be addressed are-82,170,43,140,24,16,190. And the Read/Write arm is at
50, and it is also given that the disk arm should move “towards the larger value”.
Seek time is calculated as:
=(199-50)+(199-0)+(43-0)
=391
Advantages:
Provides more uniform wait time compared to SCAN
5. LOOK: It is similar to the SCAN disk scheduling algorithm except that the disk arm, instead
of going to the end of the disk, goes only as far as the last request to be serviced in front of
the head and then reverses its direction from there. It thus prevents the extra delay caused by
unnecessary traversal to the end of the disk.
Example:
Suppose the requests to be addressed are 82,170,43,140,24,16,190. The read/write arm is at 50,
and it is given that the disk arm should move towards the larger value.
So, the seek time is calculated as:
=(190-50)+(190-16)
=314
6. C-LOOK: Just as LOOK is similar to the SCAN algorithm, C-LOOK is similar to the C-SCAN disk
scheduling algorithm. In C-LOOK, the disk arm, instead of going to the end of the disk, goes
only as far as the last request to be serviced in front of the head, and from there jumps to the
last request at the other end. Thus, it also prevents the extra delay caused by unnecessary
traversal to the end of the disk.
Example:
Suppose the requests to be addressed are 82,170,43,140,24,16,190. The read/write arm is at 50,
and it is given that the disk arm should move towards the larger value.
So, the seek time is calculated as:
=(190-50)+(190-16)+(43-16)
=341
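The four example totals above (SCAN 332, C-SCAN 391, LOOK 314, C-LOOK 341) can be reproduced directly. The sketch assumes cylinders 0-199 (the 199 used in the SCAN calculation) and the head at 50 moving toward larger values.

```python
# Reproduce the SCAN-family seek totals for the worked example.
requests = [82, 170, 43, 140, 24, 16, 190]
HEAD, DISK_END = 50, 199

up = sorted(r for r in requests if r >= HEAD)     # requests ahead of the head
down = sorted(r for r in requests if r < HEAD)    # requests behind the head

# SCAN: sweep to the disk end, then reverse down to the lowest request.
scan = (DISK_END - HEAD) + (DISK_END - min(requests))

# C-SCAN: sweep to the end, jump to cylinder 0, sweep up to the last request.
cscan = (DISK_END - HEAD) + DISK_END + max(down)

# LOOK: go only as far as the last request each way.
look = (max(up) - HEAD) + (max(up) - min(requests))

# C-LOOK: go to the last request, jump down to the lowest request, then
# service up to the last remaining request (43 here).
clook = (max(up) - HEAD) + (max(up) - min(requests)) + (max(down) - min(requests))

print(scan, cscan, look, clook)  # → 332 391 314 341
```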
7. RSS: It stands for random scheduling and, as its name suggests, it services requests in
random order. It fits situations where scheduling involves random attributes such as random
processing times, random due dates, random weights, and stochastic machine breakdowns, which is
why it is usually used for analysis and simulation.
8. LIFO: In the LIFO (Last In, First Out) algorithm, the newest jobs are serviced before the
existing ones, i.e. the request that entered last is serviced first, and then the rest in the
same order.
Advantages
Maximizes locality and resource utilization
Disadvantages
Can seem unfair to other requests; if new requests keep coming in, it causes starvation of the
old, existing ones.
9. N-STEP SCAN: It is also known as the N-STEP LOOK algorithm. In this, a buffer is created for
N requests. All requests belonging to a buffer are serviced in one go. Once the buffer is full,
no new requests are added to it; they go to the next buffer. When these N requests have been
serviced, the next N requests are taken up, and in this way all requests get guaranteed service.
Advantages
It eliminates starvation of requests completely
10. FSCAN: This algorithm uses two sub-queues. During a scan, all requests in the first queue
are serviced and new incoming requests are added to the second queue. All new requests are kept
on hold until the existing requests in the first queue have been serviced.
Advantages
FSCAN, along with N-STEP SCAN, prevents "arm stickiness" (a phenomenon in I/O scheduling where
the scheduling algorithm keeps servicing requests at or near the current sector and thus
prevents any seeking).
Each algorithm is unique in its own way. Overall Performance depends on the number and type
of requests.
Note: Average rotational latency is generally taken as half the time of one full rotation.
Disk structures
Disk platters are formatted in a system of concentric circles, or rings, called tracks. Within
each track are sectors, which subdivide the circle into a system of arcs, each formatted to hold
the same amount of data—typically 512 bytes.
There is one read/write head for every surface of the disk. The set of tracks at the same
position on all surfaces is known as a cylinder. When talking about movement of the read/write
heads, the cylinder is a useful concept, because all the heads (one for each surface) move in
and out of the disk together.
We say that the “read/write head is at cylinder #2", when we mean that the top read/write head is
at track #2 of the top surface, the next head is at track #2 of the next surface, the third head is at
track #2 of the third surface, etc.
The unit of information transfer is the sector (though often whole tracks may be read and
written, depending on the hardware). As far as most file systems are concerned, though, the
sectors are what matter. In fact, we usually talk about a 'block device'. A block often
corresponds to a sector, though it need not: several sectors may be aggregated to form a single
logical block.
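The cylinder/head/sector addressing described above maps onto a flat logical block number. This is a sketch under assumed, illustrative geometry (4 heads, 32 sectors per track) and the common convention that sectors are numbered from 1 within a track.

```python
# Sketch of mapping a (cylinder, head, sector) address to a logical block
# number: tracks fill cylinder by cylinder, head by head.
HEADS = 4                 # surfaces (one read/write head each) -- assumed
SECTORS_PER_TRACK = 32    # assumed geometry

def chs_to_lba(cylinder, head, sector):
    """Logical block address for a 1-based sector number within a track."""
    return (cylinder * HEADS + head) * SECTORS_PER_TRACK + (sector - 1)

print(chs_to_lba(0, 0, 1))   # first sector of the disk → 0
print(chs_to_lba(2, 1, 5))   # (2*4 + 1)*32 + 4 → 292
```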
The range of services and add-ons provided by modern operating systems is constantly expanding,
but four basic management functions are implemented by all operating systems. These functions
(each of which is dealt with in more detail elsewhere) are:
Process Management
Memory Management
File and Disk Management
I/O System Management
Most computer systems employ secondary storage devices (magnetic disks, tape, optical media,
flash drives, etc.), which provide low-cost, non-volatile storage for programs and data.
Programs and the user data they use are kept in named collections called files. The operating
system is responsible for allocating space for files on secondary storage media as needed.
There is no guarantee that files will be stored in contiguous locations on physical disk drives,
especially large files; it depends greatly on the amount of space available. When the disk is
full, new files are more likely to be recorded in multiple locations. However, as far as the
user is concerned, the file abstraction provided by the operating system hides the fact that the
file is fragmented into multiple parts.
The operating system needs to track the location of the disk for every part of every file on the
disk. In some cases, this means tracking hundreds of thousands of files and file fragments on a
single physical disk. Additionally, the operating system must be able to locate each file and
perform read and write operations on it whenever needed. Therefore, the operating system is
responsible for configuring the file system, ensuring the safety and reliability of read and
write operations to secondary storage, and keeping access times (the time required to write data
to or read data from secondary storage) acceptable.
Disk management by the operating system includes:
Disk formatting
Booting from disk
Bad-block recovery
Low-level (physical) formatting divides the disk into sectors before data can be stored, so that
the disk controller can read and write them. Each sector typically holds a header, a data area
(usually 512 bytes), and a trailer containing an error-correcting code (ECC). To use a disk to
hold files, the operating system then records its own data structures on the disk.
This is done in two steps:
1. Partition the disk into one or more groups of cylinders; each group is treated as a logical
disk.
2. Logical formatting, or "creating a file system": the OS stores the initial file-system data
structures on the disk, including maps of free and allocated space.
For efficiency, most file systems group blocks into clusters. Disk I/O is done in blocks; file
I/O is done in clusters.
Boot block:
When the computer is turned on or restarted, a program stored in ROM, the initial bootstrap,
finds the location of the OS kernel on disk, loads the kernel into memory, and starts the OS.
Changing the bootstrap code would require changing the ROM hardware chip, so only a small
bootstrap loader program is stored in ROM instead.
The full bootstrap code is stored in the “boot block” of the disk.
A disk with a boot partition is called a boot disk or system disk.
Bad Blocks:
Bad blocks are disk blocks that can no longer be used reliably because of a defect; they are
typically handled by remapping them to spare sectors (sector sparing).
FILE SHARING
File sharing is the practice of sharing or offering access to digital information or resources,
including documents, multimedia (audio/video), graphics, computer programs, images and e-books.
It is the private or public distribution of data or resources in a network with different levels
of sharing privileges.
1. MULTIPLE USERS:
When an operating system accommodates multiple users, the issues of file sharing, file naming
and file protection become preeminent.
· The system can either allow a user to access the files of other users by default, or it may
require that a user specifically grant access to the files.
· To implement sharing and protection, the system must maintain more file and directory
attributes than on a single-user system.
· The owner is the user who may change attributes and grant access, and who has the most
control over the file or directory.
· The group attribute of a file is used to define a subset of users who may share access to the
file.
· Most systems implement owner attributes by managing a list of user names and associated user
identifiers (user IDs).
· When a user logs in to the system, the authentication stage determines the appropriate user
ID for the user. That user ID is associated with all of the user's processes and threads. When
they need to be user-readable, they are translated back to user names via the user-name list.
· Likewise, group functionality can be implemented as a system-wide list of group names and
group identifiers.
· Every user can be in one or more groups, depending on operating-system design decisions. The
user's group IDs are also included in every associated process and thread.
· Networking allows the sharing of resources spread within a campus or even around the world.
Users manually transfer files between machines via programs like ftp.
· A distributed file system (DFS), in which remote directories are visible from the local
machine.
· The World Wide Web: a browser is needed to gain access to the remote files, and separate
operations (essentially a wrapper for ftp) are used to transfer files.
Remote file systems allow a computer to mount one or more file systems from one or more remote
machines.
• A server can serve multiple clients, and a client can use multiple servers, depending on the
implementation details of a given client-server facility.
• Client identification is more difficult. Clients can be specified by their network name or
another identifier, such as an IP address, but these can be spoofed (imitated). An unauthorized
client can spoof the server into deciding that it is authorized, and the unauthorized client
could be allowed access.
· Distributed information systems, also known as distributed naming services, have been devised
to provide unified access to the information needed for remote computing.
Failure Modes:
· Redundant arrays of inexpensive disks (RAID) can prevent the loss of a disk from
resulting in the loss of data.
Remote file systems have more failure modes. Because of the complexity of networked systems and
the required interactions between remote machines, many more problems can interfere with the
proper operation of remote file systems.
Consistency Semantics:
o Consistency semantics characterize a system by specifying the semantics of multiple users
accessing a shared file simultaneously.
o These semantics should specify when modifications of data by one user are observable by other
users.
o The semantics are typically implemented as code within the file system.
o A series of file accesses (that is, reads and writes) attempted by a user on the same file is
always enclosed between the open and close operations.
o The series of accesses between the open and close operations is a file session.
UNIX file systems use the following consistency semantics:
1. Writes to an open file by a user are visible immediately to other users who have the file
open at the same time.
2. One mode of sharing allows users to share the pointer of the current location into the file.
Thus, the advancing of the pointer by one user affects all sharing users.
The Andrew file system (AFS) uses the following consistency semantics:
1. Writes to an open file by a user are not visible immediately to other users who have the
same file open simultaneously.
2. Once a file is closed, the changes made to it are visible only in sessions starting later.
Already-open instances of the file do not reflect the change.
o Immutable-shared-file semantics: once a file is declared shared, its name may not be reused
and its contents may not be altered.
FILE PROTECTION
· When information is kept in a computer system, we want to keep it safe from physical damage
(reliability) and improper access (protection).
· Reliability is generally provided by duplicate copies of files. Many computers have system
programs that automatically (or through computer-operator intervention) copy disk files to tape
at regular intervals (once per day, week or month) to maintain a copy should a file system be
accidentally destroyed.
· Protection can be provided in many ways. For a small single-user system, we might
provide protection by physically removing the floppy disks and locking them in a desk
drawer or file cabinet. In a multi-user system, however, other mechanisms are needed.
2. Types of Access
· Protection mechanisms provide controlled access by limiting the types of file access that
can be made. Access is permitted or denied depending on several factors, one of which is
the type of access requested. Several different types of operations may be controlled:
1. Read: Read from the file.
2. Write: Write or rewrite the file.
3. Execute: Load the file into memory and execute it.
4. Append: Write new information at the end of the file.
5. Delete: Delete the file and free its space for possible reuse.
6. List: List the name and attributes of the file.
3. Access Control
· Associate with each file and directory an access-control list (ACL) specifying the user name
and the types of access allowed for each user.
· When a user requests access to a particular file, the operating system checks the access list
associated with that file. If that user is listed for the requested access, the access is
allowed. Otherwise, a protection violation occurs and the user job is denied access to the file.
• Constructing such a list may be a tedious and unrewarding task, especially if we do not know in
advance the list of users in the system.
• The directory entry, previously of fixed size, now needs to be of variable size, resulting in more
complicated space management.
· To condense the length of the access control list, many systems recognize three
classifications of users in connection with each file:
Owner: The user who created the file.
Group: A set of users who are sharing the file and need similar access; this is a group, or
work group.
Universe: All other users in the system.
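The owner/group/universe classification can be sketched with UNIX-style rwx permission triples. The user names, group memberships, and mode string below are illustrative assumptions, not from any real system.

```python
# Sketch of an owner/group/universe permission check with rwx bits.
file_meta = {"owner": "alice", "group": "staff", "mode": "rw-r-----"}
user_groups = {"alice": {"staff"}, "bob": {"staff"}, "eve": {"guest"}}

def allowed(user, op, meta):
    """Pick the owner, group, or universe permission triple, then test op."""
    if user == meta["owner"]:
        triple = meta["mode"][0:3]          # owner bits
    elif meta["group"] in user_groups.get(user, set()):
        triple = meta["mode"][3:6]          # group bits
    else:
        triple = meta["mode"][6:9]          # universe bits
    return {"read": "r", "write": "w", "execute": "x"}[op] in triple

print(allowed("alice", "write", file_meta))  # owner → True
print(allowed("bob", "read", file_meta))     # group member → True
print(allowed("eve", "read", file_meta))     # universe → False
```

Three fixed classes keep the per-file protection information to just nine bits instead of a full access-control list.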
The system keeps track of the free disk blocks for allocating space to files when they are
created. Also, to reuse the space released by deleting files, free space management becomes
crucial. The system maintains a free space list which keeps track of the disk blocks that are
not allocated to any file or directory. The free space list can be implemented mainly as:
1. Bitmap or Bit vector –
A bitmap or bit vector is a series or collection of bits where each bit corresponds to a disk
block. The bit can take two values, 0 and 1: 0 indicates that the block is allocated and 1
indicates a free block.
The given instance of disk blocks on the disk in Figure 1 (where green blocks are allocated)
can be represented by a bitmap of 16 bits as: 0000111000000110.
Advantages –
Simple to understand.
Finding the first free block is efficient. It requires scanning the words (groups of 8 bits)
in the bitmap for a non-zero word (a 0-valued word has all bits 0). The first free block is
then found by scanning for the first 1 bit in the non-zero word.
The block number can be calculated as:
(number of bits per word) × (number of 0-valued words) + (offset of the first 1 bit in the
non-zero word).
For the Figure-1, we scan the bitmap sequentially for the first non-zero word.
The first group of 8 bits (00001110) constitutes a non-zero word, since not all of its bits are
0. After the non-zero word is found, we look for the first 1 bit. This is the 5th bit of the
non-zero word, so the offset is 5.
Therefore, the first free block number = 8*0+5 = 5.
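The word-scan calculation above can be sketched directly. The sketch follows this section's conventions (1 = free, 8-bit words, first-1-bit offset counted from 1, as in the worked example).

```python
# Sketch of the word-scan formula: scan 8-bit words for the first non-zero
# word, then locate the first 1 bit (1-based offset) inside it.
bitmap = "0000111000000110"
BITS_PER_WORD = 8

def first_free_block(bits):
    words = [bits[i:i + BITS_PER_WORD] for i in range(0, len(bits), BITS_PER_WORD)]
    for w, word in enumerate(words):
        if "1" in word:                       # first non-zero word found
            offset = word.index("1") + 1      # first 1 bit, counted from 1
            return BITS_PER_WORD * w + offset # bits/word * zero-words + offset
    return -1                                 # no free block on the disk

print(first_free_block(bitmap))  # → 8*0 + 5 = 5, matching the example
```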
2. Linked List –
In this approach, the free disk blocks are linked together i.e. a free block contains a pointer to
the next free block. The block number of the very first disk block is stored at a separate
location on disk and is also cached in memory.
In Figure-2, the free space list head points to Block 5 which points to Block 6, the next free
block and so on. The last free block would contain a null pointer indicating the end of free
list.
A drawback of this method is the I/O required for free space list traversal.
3. Grouping –
This approach stores the addresses of free blocks in the first free block. The first free block
stores the addresses of some n free blocks. Of these n blocks, the first n-1 are actually free,
and the last block contains the addresses of the next n free blocks.
An advantage of this approach is that the addresses of a group of free disk blocks can be
found easily.
4. Counting –
This approach stores the address of the first free disk block and a number n of free
contiguous disk blocks that follow the first block.
Every entry in the list would contain:
1. Address of first free disk block
2. A number n
For example, in Figure-1, the first entry of the free space list would be: ([Address of Block
5], 2), because 2 contiguous free blocks follow block 5.
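Building the counting representation from a sorted list of free block numbers can be sketched as a run-length pass; each entry is (first free block, number of contiguous free blocks that follow it), matching the ([Block 5], 2) example above.

```python
# Sketch of the counting approach: collapse runs of contiguous free blocks
# into (start, n-following) entries.
def counting_entries(free_blocks):
    """free_blocks: sorted list of free block numbers."""
    entries = []
    i = 0
    while i < len(free_blocks):
        start = free_blocks[i]
        n = 0
        # extend the run while the next block number is contiguous
        while i + 1 < len(free_blocks) and free_blocks[i + 1] == free_blocks[i] + 1:
            n += 1
            i += 1
        entries.append((start, n))
        i += 1
    return entries

# Free blocks 5,6,7 and 14,15 (as in the Figure-1 example):
print(counting_entries([5, 6, 7, 14, 15]))  # → [(5, 2), (14, 1)]
```

The list is much shorter than a per-block structure whenever free space tends to be contiguous.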
Mass-Storage Structure
Disk Structure
Each modern disk contains concentric tracks, and each track is divided into multiple sectors.
The disk is usually arranged as a one-dimensional array of blocks, where blocks are the smallest
storage unit; blocks are also called sectors. For each surface of the disk there is a read/write
head available. The set of tracks at the same position on all the surfaces is known as a
cylinder.