
Os Unit 5

Files are named collections of data stored on storage devices. The file management system provides I/O support across storage types, minimizes data loss, and standardizes interfaces. Files have names, permissions, and can be organized hierarchically. A file structure defines how different file types like text, programs, and objects are organized. Operating systems distinguish file types like ordinary, directory, and special files. Files have attributes that provide metadata. Access mechanisms include sequential, random, and indexed access. Space is allocated via contiguous, linked, or indexed methods. Directories store file information and support searching, creation, deletion and other operations. Directory structures can be single-level, two-level, or tree-based.

Uploaded by

balu 1203

Operating System - File System

File
A file is a named collection of related information recorded on secondary storage
such as magnetic disks, magnetic tapes and optical disks. In general, a file is a
sequence of bits, bytes, lines or records whose meaning is defined by the file's creator
and user.

Objectives of the File Management System

Here are the main objectives of the file management system:
∙ It provides I/O support for a variety of storage device types.
∙ It minimizes the chances of lost or destroyed data.
∙ It helps the OS provide standardized I/O interface routines for user processes.
∙ It provides I/O support for multiple users in a multiuser system environment.

Properties of a File System


Here are important properties of a file system:

∙ Files are stored on disk or other storage and do not disappear when a
user logs off.
∙ Files have names and are associated with access permissions that
permit controlled sharing.
∙ Files can be arranged in hierarchical or more complex structures to reflect the
relationships between them.
Operating Systems - Unit 5 GNIT, Hyderabad.
File Structure
A file structure must follow a format that the operating system can
understand.
∙ A file has a certain defined structure according to its type.
∙ A text file is a sequence of characters organized into lines.
∙ A source file is a sequence of procedures and functions.
∙ An object file is a sequence of bytes organized into blocks that are understandable
by the machine.
∙ When an operating system defines different file structures, it must also contain the code to
support them. UNIX and MS-DOS support only a minimal number of file
structures.

File Type
File type refers to the ability of the operating system to distinguish different types of files,
such as text files, source files and binary files. Many operating systems support
many types of files. Operating systems like MS-DOS and UNIX have the following types
of files −

Ordinary files

∙ These are the files that contain user information.
∙ These may contain text, databases or executable programs.
∙ The user can apply various operations to such files, such as adding, modifying or deleting
content, or even removing the entire file.

Directory files
∙ These files contain a list of file names and other information related to those files.

Special files

∙ These files are also known as device files.
∙ These files represent physical devices such as disks, terminals, printers, networks and
tape drives.

These files are of two types −
∙ Character special files − data is handled character by character, as in the case of
terminals or printers.
∙ Block special files − data is handled in blocks, as in the case of disks and tapes.



File Attributes
A file has a name and data. In addition, the system stores meta-information about it, such as the
creation date and time, current size and last modified date. All this information
is called the attributes of the file.

Here are some important file attributes used in an OS:

∙ Name: It is the only information stored in a human-readable form.
∙ Identifier: Every file is identified by a unique tag number within the file
system, known as an identifier.
∙ Location: Points to the file's location on the device.
∙ Type: This attribute is required for systems that support various types of
files.
∙ Size: The current size of the file.
∙ Protection: This attribute assigns and controls the access rights for
reading, writing, and executing the file.
∙ Time, date and security: Used for protection, security, and usage
monitoring.

Functions of File
∙ Create: find space on disk and make an entry in the directory.
∙ Write: requires positioning within the file.
∙ Read: also requires positioning within the file.
∙ Delete: remove the directory entry and reclaim the disk space.
∙ Reposition: move the read/write position.



File Access Mechanisms
File access mechanism refers to the manner in which the records of a file may be
accessed. There are several ways to access files −

∙ Sequential access
∙ Direct/Random access
∙ Indexed sequential access

Sequential access

In sequential access, the records are accessed in some sequence, i.e.,
the information in the file is processed in order, one record after the other. This access
method is the most primitive one. Example: Compilers usually access files in this
fashion.

Direct/Random access

∙ Random access file organization allows records to be accessed directly.
∙ Each record has its own address in the file, by means of which it can be
directly accessed for reading or writing.
∙ The records need not be in any sequence within the file and they need not be in
adjacent locations on the storage medium.
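
As a minimal sketch of direct access, assume a file of fixed-length records, so that the address of record k is simply k times the record size. The record layout (a 4-byte id plus a 12-byte name) and the function names are hypothetical, chosen only for illustration.

```python
import os
import struct
import tempfile

# Hypothetical fixed-length record layout: a 4-byte integer id plus a 12-byte name.
RECORD = struct.Struct("<i12s")

def write_records(path, records):
    """Write records back to back, so record k starts at byte k * RECORD.size."""
    with open(path, "wb") as f:
        for rec_id, name in records:
            f.write(RECORD.pack(rec_id, name.encode("ascii")))

def read_record(path, k):
    """Jump straight to record k (counting from 0) without reading records 0..k-1."""
    with open(path, "rb") as f:
        f.seek(k * RECORD.size)
        rec_id, raw = RECORD.unpack(f.read(RECORD.size))
        # struct pads short names with NUL bytes; strip them before decoding.
        return rec_id, raw.rstrip(b"\x00").decode("ascii")
```

Calling read_record(path, 2) positions the file pointer at byte 2 * 16 = 32 and reads one record, regardless of how many records precede it.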

Indexed sequential access


∙ This mechanism is built on top of sequential access.
∙ An index is created for each file, containing pointers to various blocks.
∙ The index is searched sequentially and its pointer is used to access the file directly.
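
The idea can be sketched as follows, assuming records are kept sorted by key and the index stores the first key of each block. The keys, values and block size here are made up for illustration.

```python
# Hypothetical file: records sorted by key, grouped into blocks of three.
blocks = [
    [(3, "c"), (7, "g"), (9, "i")],
    [(12, "l"), (15, "o"), (18, "r")],
    [(21, "u"), (24, "x"), (26, "z")],
]

# The index holds the first key of each block, in block order.
index = [block[0][0] for block in blocks]

def find_block(key):
    """Scan the index sequentially for the last block whose first key is <= key."""
    pos = None
    for i, first_key in enumerate(index):
        if first_key > key:
            break
        pos = i
    return pos

def lookup(key):
    """Use the index to pick one block, then search only that block."""
    pos = find_block(key)
    if pos is None:
        return None
    for k, value in blocks[pos]:
        if k == key:
            return value
    return None
```

Only the index and a single block are examined per lookup, instead of every record in the file.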

Space Allocation
Files are allocated disk space by the operating system. Operating systems use the
following three main ways to allocate disk space to files.

∙ Contiguous Allocation
∙ Linked Allocation
∙ Indexed Allocation

Contiguous Allocation

∙ Each file occupies a contiguous address space on disk.
∙ Assigned disk addresses are in linear order.
∙ Easy to implement.
∙ External fragmentation is a major issue with this type of allocation technique.

Linked Allocation

∙ Each file carries a list of links to disk blocks.
∙ The directory contains a link / pointer to the first block of a file.
∙ No external fragmentation.
∙ Effective for sequential access files.
∙ Inefficient for direct access files.

Indexed Allocation

∙ Provides solutions to the problems of contiguous and linked allocation.
∙ An index block is created holding all the pointers to a file's blocks.
∙ Each file has its own index block, which stores the addresses of the disk space
occupied by the file.
∙ The directory contains the addresses of the index blocks of files.
FILE DIRECTORIES:
A file directory is a collection of files. The directory contains information about the
files, including attributes, location and ownership. Much of this information,
especially that concerned with storage, is managed by the operating system.
The directory is itself a file, accessible by various file management routines.

The information contained in a device directory is:
∙ Name
∙ Type
∙ Address
∙ Current length
∙ Maximum length
∙ Date last accessed
∙ Date last updated
∙ Owner id
∙ Protection information
Operations performed on a directory are:
∙ Search for a file
∙ Create a file
∙ Delete a file
∙ List a directory
∙ Rename a file
∙ Traverse the file system
Advantages of maintaining directories are:
∙ Efficiency: A file can be located more quickly.
∙ Naming: It is convenient for users, as two users can use the same name for
different files or different names for the same file.
∙ Grouping: Files can be logically grouped by their properties, e.g. all Java
programs, all games, etc.
SINGLE-LEVEL DIRECTORY
In this structure, a single directory is maintained for all users.
∙ Naming problem: Two files cannot have the same name, even if they belong to different users.
∙ Grouping problem: Users cannot group files according to their needs.



TWO-LEVEL DIRECTORY
In this structure, a separate directory is maintained for each user.
∙ Path name: Because there are two levels, every file has a path name that locates
it.
∙ Different users can now use the same file name.
∙ Searching is efficient in this method.
TREE-STRUCTURED DIRECTORY:
The directory is maintained in the form of a tree. Searching is efficient and
grouping is possible. A file can be referred to by an absolute or a relative path name.

FILE ALLOCATION METHODS
1. Contiguous Allocation: A single contiguous set of blocks is allocated to a file
at the time of file creation. Thus, this is a pre-allocation strategy, using variable-size
portions. The file allocation table needs just a single entry for each file,
showing the starting block and the length of the file. This method is best from the
point of view of the individual sequential file. Multiple blocks can be read in at a
time to improve I/O performance for sequential processing. It is also easy to
retrieve a single block. For example, if a file starts at block b, and the ith block of
the file is wanted (counting from i = 1), its location on secondary storage is simply b+i-1.

Disadvantages
∙ External fragmentation will occur, making it difficult to find contiguous blocks of
space of sufficient length. A compaction algorithm will be necessary to free up
additional space on disk.
∙ Also, with pre-allocation, it is necessary to declare the size of the file at the
time of creation.
2. Linked Allocation (Non-contiguous allocation): Allocation is on an
individual block basis. Each block contains a pointer to the next block in the
chain. Again, the file allocation table needs just a single entry for each file, showing the
starting block and the length of the file. Although pre-allocation is possible, it is
more common simply to allocate blocks as needed. Any free block can be
added to the chain. The blocks need not be contiguous. The file can always
grow as long as a free disk block is available. There is no external
fragmentation because only one block at a time is needed. There can be
internal fragmentation, but only in the last disk block of the file.

Disadvantages:
∙ Internal fragmentation exists in the last disk block of the file.
∙ There is an overhead of maintaining the pointer in every disk block.
∙ If the pointer of any disk block is lost, the file will be truncated.
∙ It supports only sequential access to files.
3. Indexed Allocation:
It addresses many of the problems of contiguous and chained allocation. In this
case, the file allocation table contains a separate one-level index for each file.
The index has one entry for each block allocated to the file. Allocation may be on
the basis of fixed-size blocks or variable-size blocks. Allocation by fixed-size blocks
eliminates external fragmentation, whereas allocation by variable-size blocks
improves locality. This allocation technique supports both sequential and direct
access to the file and thus is the most popular form of file allocation.
Disk Free Space Management
Just as the space that is allocated to files must be managed, so the space that is
not currently allocated to any file must be managed. To perform any of the file
allocation techniques, it is necessary to know which blocks on the disk are
available. Thus we need a disk allocation table in addition to a file allocation
table. The following approaches are used for free space management.
1. Bit Tables: This method uses a vector containing one bit for each block on
the disk. Each 0 entry corresponds to a free block and each 1
corresponds to a block in use.
For example: 00011010111100110001
In this vector every bit corresponds to a particular block: 0 means that the
block is free and 1 means that the block is already occupied.
A bit table has the advantage that it is relatively easy to find one or a
contiguous group of free blocks. Thus, a bit table works well with any of the
file allocation methods. Another advantage is that it is as small as possible.
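
A small sketch of how a bit table can be scanned, using the example vector from the text. Numbering the blocks from 0, left to right, is an assumption made here; the function names are illustrative.

```python
BITS = "00011010111100110001"  # example vector from the text: 0 = free, 1 = in use

def free_blocks(bits):
    """Return the numbers of all free blocks (blocks are numbered from 0)."""
    return [i for i, b in enumerate(bits) if b == "0"]

def find_free_run(bits, n):
    """Return the first block number that starts a run of n free blocks, or None."""
    run_start, run_len = None, 0
    for i, b in enumerate(bits):
        if b == "0":
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == n:
                return run_start
        else:
            run_len = 0
    return None
```

With this vector, a request for three contiguous blocks is satisfied starting at block 0, while a request for four contiguous blocks fails even though ten blocks are free in total.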
2. Free Block List: In this method, each block is assigned a number
sequentially and the list of the numbers of all free blocks is maintained in a
reserved block of the disk.


Structures of Directory in Operating System
A directory is a container that holds files and other directories (folders). It organizes
them in a hierarchical manner.

There are several logical structures of a directory; these are given below.
∙ Single-level directory –
The single-level directory is the simplest directory structure. In it, all files
are contained in the same directory, which makes it easy to support and
understand.
A single-level directory has a significant limitation, however, when the
number of files increases or when the system has more than one user. Since
all the files are in the same directory, they must have unique names. If two
users both call their dataset test, the unique name rule is violated.



Advantages:
∙ Since there is only a single directory, implementation is very easy.
∙ If the number of files is small, searching is fast.
∙ Operations like file creation, searching, deletion and updating are very easy
in such a directory structure.
Disadvantages:
∙ Name collisions are likely because two files cannot have the same
name.
∙ Searching becomes time-consuming if the directory is large.
∙ Files of the same type cannot be grouped together.
∙ Two-level directory –
As we have seen, a single-level directory often leads to confusion of file
names among different users. The solution to this problem is to create a
separate directory for each user.
In the two-level directory structure, each user has their own user file
directory (UFD). The UFDs have similar structures, but each lists only the
files of a single user. The system's master file directory (MFD) is searched
whenever a user logs in. The MFD is indexed by username or
account number, and each entry points to the UFD for that user.

Advantages:
∙ We can give a full path like /User-name/directory-name/.
∙ Different users can have the same directory name as well as the same file name.
∙ Searching for files becomes easier due to path names and user grouping.
Disadvantages:
∙ A user is not allowed to share files with other users.
∙ It is still not very scalable; two files of the same type cannot be grouped
together within the same user's directory.
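
The MFD/UFD lookup can be pictured as a mapping of mappings. This is only a sketch: the user names, file names and contents are hypothetical, and a real system stores directory entries on disk rather than in dictionaries.

```python
# The MFD maps each user name to that user's UFD; each UFD maps file names to files.
mfd = {
    "alice": {"test": "alice's data", "notes": "alice's notes"},
    "bob": {"test": "bob's data"},
}

def lookup(user, filename):
    """Find the user's UFD in the MFD first, then the file inside that UFD."""
    ufd = mfd.get(user)
    if ufd is None:
        raise KeyError(f"no such user: {user}")
    if filename not in ufd:
        raise KeyError(f"no such file for {user}: {filename}")
    return ufd[filename]
```

Both users own a file called test without any collision, because each name is resolved inside its owner's UFD only.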

∙ Tree-structured directory –
Once we have seen a two-level directory as a tree of height 2, the natural
generalization is to extend the directory structure to a tree of arbitrary
height.
This generalization allows the user to create their own subdirectories and to
organize their files accordingly.

A tree structure is the most common directory structure. The tree has a root directory,
and every file in the system has a unique path.
Advantages:
∙ Very general, since a full path name can be given.
∙ Very scalable; the probability of name collision is low.
∙ Searching becomes very easy; we can use both absolute and
relative paths.
Disadvantages:
∙ Not every file fits into the hierarchical model; some files may belong in
multiple directories.
∙ We cannot share files.
∙ It can be inefficient, because accessing a file may require traversing multiple directories.
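
To illustrate absolute and relative path names, here is a short sketch using Python's posixpath module. The directory names are hypothetical, and resolve is an illustrative helper, not part of any operating system interface.

```python
import posixpath

def resolve(cwd, path):
    """Return an absolute path; a relative path is interpreted starting from cwd."""
    # posixpath.join discards cwd when path is already absolute;
    # posixpath.normpath then collapses "." and ".." components.
    return posixpath.normpath(posixpath.join(cwd, path))
```

For example, with the current directory /home/alice/projects, the relative path ../notes/todo.txt names the same file as the absolute path /home/alice/notes/todo.txt.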

∙ Acyclic graph directory –
An acyclic graph is a graph with no cycles; it allows us to share
subdirectories and files. The same file or subdirectory may appear in two
different directories. It is a natural generalization of the tree-structured
directory.
It is used in situations such as when two programmers are working on a joint
project and both need to access its files. The associated files are stored in a
subdirectory, separating them from other projects and from the files of other
programmers. Since they are working on a joint project, both want the
subdirectory to be in their own directories, so the common subdirectory
should be shared. This is where acyclic graph directories are used.
Note that a shared file is not the same as a copy of the file. If
any programmer makes a change in the shared subdirectory, it is reflected in
both directories.
Advantages:
∙ We can share files.
∙ A file can be reached through more than one path.
Disadvantages:
∙ Files are shared via links, so deleting a file may create problems.
∙ If the link is a soft link, then after deleting the file we are left with a dangling
pointer.
∙ In the case of a hard link, to delete a file we have to delete all the references
associated with it.
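
The difference between the two kinds of link can be observed with a short experiment. This sketch assumes a POSIX-like system where both os.link and os.symlink are permitted; the file names are hypothetical.

```python
import os
import tempfile

def link_demo():
    """Create a file with one hard link and one soft link, delete the original
    name, and report what each link still sees."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "report.txt")
        hard = os.path.join(d, "hard.txt")
        soft = os.path.join(d, "soft.txt")
        with open(original, "w") as f:
            f.write("shared data")
        os.link(original, hard)     # a second directory entry for the same file
        os.symlink(original, soft)  # a separate entry that stores the original's path
        os.remove(original)         # delete the original name
        with open(hard) as f:
            hard_content = f.read()
        return {
            "hard_link_content": hard_content,              # data is still reachable
            "soft_link_is_link": os.path.islink(soft),      # the link itself remains
            "soft_link_target_exists": os.path.exists(soft) # but its target is gone
        }
```

After the original name is removed, the hard link still reaches the data, while the soft link remains as a dangling pointer to a path that no longer exists.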

∙ General graph directory structure –
In a general graph directory structure, cycles are allowed within the directory
structure, and a directory can have more than one
parent directory.
The main problem with this kind of directory structure is calculating the
total size or space taken by the files and directories.
Advantages:
∙ It allows cycles.
∙ It is more flexible than the other directory structures.
Disadvantages:
∙ It is more costly than the others.
∙ It needs garbage collection.


File Allocation Methods


The allocation methods define how the files are stored in the disk blocks. There are
three main disk space or file allocation methods.
∙ Contiguous Allocation
∙ Linked Allocation
∙ Indexed Allocation
The main idea behind these methods is to provide:
∙ Efficient disk space utilization.
∙ Fast access to the file blocks.
All three methods have their own advantages and disadvantages, as discussed
below:
1. Contiguous Allocation
In this scheme, each file occupies a contiguous set of blocks on the disk. For example,
if a file requires n blocks and is given block b as the starting location, then the
blocks assigned to the file will be: b, b+1, b+2, …, b+n-1. This means that given the
starting block address and the length of the file (in terms of blocks required), we can
determine the blocks occupied by the file.
The directory entry for a file with contiguous allocation contains
∙ Address of starting block
∙ Length of the allocated portion.
The file ‘mail’ in the following figure starts from block 19 with length = 6 blocks.
Therefore, it occupies blocks 19, 20, 21, 22, 23 and 24.
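
The block arithmetic for contiguous allocation is simple enough to sketch directly. The function names are illustrative, and block_address counts file blocks from k = 0.

```python
def contiguous_blocks(start, length):
    """All disk blocks occupied by a file that starts at `start` and spans `length` blocks."""
    return list(range(start, start + length))

def block_address(start, length, k):
    """Disk block holding block k of the file, counting from k = 0."""
    if not 0 <= k < length:
        raise IndexError("block index outside the file")
    return start + k
```

For the ‘mail’ example, contiguous_blocks(19, 6) yields blocks 19 through 24, and any single block is found with one addition.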



Advantages:
∙ Both sequential and direct access are supported. For direct access,
the address of the kth block of a file that starts at block b (counting from k = 0) can easily be
obtained as (b+k).
∙ This is extremely fast since the number of seeks is minimal because of the
contiguous allocation of file blocks.

Disadvantages:
∙ This method suffers from both internal and external fragmentation. This makes it
inefficient in terms of disk space utilization.
∙ Increasing file size is difficult because it depends on the availability of contiguous
free space at a particular instant.
2. Linked List Allocation
In this scheme, each file is a linked list of disk blocks, which need not be contiguous.
The disk blocks can be scattered anywhere on the disk.
The directory entry contains a pointer to the starting and the ending file block. Each
block contains a pointer to the next block occupied by the file.
The file ‘jeep’ in the following image shows how the blocks are randomly distributed. The
last block (25) contains -1, indicating a null pointer: it does not point to any other
block.

Advantages:
∙ This is very flexible in terms of file size. File size can be increased easily since the
system does not have to look for a contiguous chunk of disk space.
∙ This method does not suffer from external fragmentation. This makes it relatively
better in terms of disk space utilization.
Disadvantages:
∙ Because the file blocks are distributed randomly on the disk, a large number of
seeks are needed to access every block individually. This makes linked allocation
slower.
∙ It does not support random or direct access. We cannot directly access the blocks
of a file. Block k of a file can be reached only by traversing k blocks sequentially
(sequential access) from the starting block of the file via the block pointers.
∙ The pointers required in linked allocation incur some extra overhead.
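
The cost of reaching block k can be seen in a small sketch of the pointer chain. Only the final block (25) and the -1 terminator come from the ‘jeep’ example in the text; the other block numbers are made up for illustration, since the figure is not reproduced here.

```python
# next_block[b] is the pointer stored in disk block b; -1 marks the end of the file.
# Hypothetical chain: only block 25 and the -1 terminator are taken from the text.
next_block = {9: 16, 16: 1, 1: 10, 10: 25, 25: -1}

def all_blocks(start):
    """Follow the chain from the starting block and list every block of the file."""
    blocks = []
    block = start
    while block != -1:
        blocks.append(block)
        block = next_block[block]
    return blocks

def kth_block(start, k):
    """Reach block k of the file (k from 0) by following k pointers, one at a time."""
    block = start
    for _ in range(k):
        block = next_block[block]
        if block == -1:
            raise IndexError("file has fewer than k + 1 blocks")
    return block
```

Each step of kth_block corresponds to reading one disk block just to learn where the next one is, which is why direct access is impractical with this scheme.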
3. Indexed Allocation
In this scheme, a special block known as the Index block contains the pointers to all
the blocks occupied by a file. Each file has its own index block. The ith entry in the
index block contains the disk address of the ith file block. The directory entry contains
the address of the index block as shown in the image:

Advantages:
∙ This supports direct access to the blocks occupied by the file and therefore
provides fast access to the file blocks.
∙ It overcomes the problem of external fragmentation.
Disadvantages:
∙ The pointer overhead for indexed allocation is greater than for linked allocation.
∙ For very small files, say files that span only 2-3 blocks, indexed allocation
keeps one entire block (the index block) for the pointers, which is inefficient in terms of
disk space utilization. In linked allocation, by contrast, we lose the space of only 1
pointer per block.



For files that are very large, a single index block may not be able to hold all the
pointers.
The following mechanisms can be used to resolve this:
1. Linked scheme: This scheme links two or more index blocks together to hold
the pointers. Every index block then contains a pointer to (the address of) the
next index block.
2. Multilevel index: In this policy, a first-level index block points to the
second-level index blocks, which in turn point to the disk blocks occupied by the file.
This can be extended to 3 or more levels depending on the maximum file size.
3. Combined scheme: In this scheme, a special block called the inode (information
node) contains all the information about the file, such as the name, size, authority, etc.,
and the remaining space of the inode is used to store the addresses of the disk blocks that
contain the actual file, as shown in the image below. The first few of these pointers in
the inode point to direct blocks, i.e. the pointers contain the addresses of the disk
blocks that contain data of the file. The next few pointers point to indirect blocks.
Indirect blocks may be single indirect, double indirect or triple indirect. A single
indirect block is a disk block that does not contain file data but the disk
addresses of the blocks that contain the file data. Similarly, a double indirect block does
not contain file data but the disk addresses of the blocks that contain the addresses of
the blocks containing the file data.
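
To make the combined scheme concrete, the sketch below works out which kind of pointer reaches a given block of a file and how many blocks a file can have at most. All the parameters are assumptions chosen for illustration (4096-byte blocks, 4-byte disk addresses, 12 direct pointers, and one single, one double and one triple indirect pointer); the text does not specify them.

```python
BLOCK_SIZE = 4096      # bytes per disk block (assumed)
POINTER_SIZE = 4       # bytes per disk address (assumed)
DIRECT_POINTERS = 12   # direct pointers in the inode (assumed)

# Number of disk addresses that fit in one indirect block.
PER_BLOCK = BLOCK_SIZE // POINTER_SIZE

def level_for_block(n):
    """Return which kind of inode pointer reaches logical block n of a file (n from 0)."""
    if n < 0:
        raise ValueError("block number must be non-negative")
    if n < DIRECT_POINTERS:
        return "direct"
    n -= DIRECT_POINTERS
    if n < PER_BLOCK:
        return "single indirect"
    n -= PER_BLOCK
    if n < PER_BLOCK ** 2:
        return "double indirect"
    n -= PER_BLOCK ** 2
    if n < PER_BLOCK ** 3:
        return "triple indirect"
    raise ValueError("block number exceeds the maximum file size")

def max_file_blocks():
    """Largest number of data blocks addressable under the assumed layout."""
    return DIRECT_POINTERS + PER_BLOCK + PER_BLOCK ** 2 + PER_BLOCK ** 3
```

Under these assumptions small files are served entirely by direct pointers, and each additional level of indirection multiplies the reachable size by the number of addresses per block.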

What is a File?
A file can be defined as a data structure that stores a sequence of records. Files are
stored in a file system, which may exist on a disk or in main memory. Files can be
simple (plain text) or complex (specially formatted).

A collection of files is known as a directory. The collection of directories at the different
levels is known as the file system.
Attributes of the File
1. Name

Every file carries a name by which the file is recognized in the file system. One directory
cannot have two files with the same name.

2. Identifier

Along with the name, each file has its own extension, which identifies the type of the file.
For example, a text file has the extension .txt, and a video file can have the extension .mp4.


3. Type

In a file system, files are classified into different types such as video files, audio files,
text files, executable files, etc.

4. Location

In the file system, there are several locations in which files can be stored. Each
file carries its location as an attribute.

5. Size

The size of the file is one of its most important attributes. By size of the file, we mean the
number of bytes occupied by the file on storage.

6. Protection

The administrator of the computer may want different protections for different files. Therefore
each file carries its own set of permissions for the different groups of users.

7. Time and Date

Every file carries a time stamp which contains the time and date at which the file was last
modified.

Operations on the File


There are various operations which can be implemented on a file. We will see all of them
in detail.

1. Create

Creation of the file is the most important operation on the file. Different types of files are
created by different methods; for example, text editors are used to create text files, word
processors are used to create word-processing documents and image editors are used to create
image files.

2. Write

Writing a file is different from creating it. The OS maintains a write pointer for
every file, which points to the position in the file at which the next data is to be written.

3. Read

A file can be opened in one of three modes: read, write and append. A read pointer
is maintained by the OS, pointing to the position up to which the data has been read.

4. Re-position

Re-positioning is simply moving the file pointers forward or backward depending upon
the user's requirement. It is also called seeking.

5. Delete

Deleting a file not only deletes all the data stored inside the file, it also deletes all
the attributes of the file. The space that was allocated to the file becomes
available and can be allocated to other files.

6. Truncate

Truncating deletes the contents of the file but keeps its attributes. The file is not completely
deleted, although the information stored inside it is removed.
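
The operations above can be tried out with Python's standard file interface. This is a sketch for illustration only: the file name and contents are hypothetical, and binary mode is used so that file positions are plain byte offsets.

```python
import os
import tempfile

def file_operations_demo():
    """Run create, write, read, re-position, truncate and delete on a scratch file."""
    results = {}
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "demo.bin")
        # Create and write: the write position advances as data is written.
        with open(path, "wb") as f:
            f.write(b"hello world")
            results["position_after_write"] = f.tell()
        # Read: the read position advances as data is read.
        with open(path, "rb") as f:
            results["first_read"] = f.read(5)
            # Re-position (seek): move the position without reading.
            f.seek(6)
            results["after_seek"] = f.read()
        # Truncate: discard the contents but keep the file itself.
        with open(path, "r+b") as f:
            f.truncate(0)
        results["size_after_truncate"] = os.path.getsize(path)
        results["exists_after_truncate"] = os.path.exists(path)
        # Delete: remove the directory entry so the space can be reused.
        os.remove(path)
        results["exists_after_delete"] = os.path.exists(path)
    return results
```

Note how truncation leaves an empty file that still exists, whereas deletion removes the file altogether.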

File Access Methods

Let's look at various ways to access files stored in secondary memory.

Sequential Access

Most operating systems access files sequentially. In other words,
most files need to be accessed sequentially by the operating system.

In sequential access, the OS reads the file word by word. A pointer is maintained which
initially points to the base address of the file. If the user wants to read the first word of the
file, then the pointer provides that word to the user and increases its value by 1 word.
This process continues till the end of the file.

Modern systems do provide direct access and indexed access, but
the most used method is sequential access, because most files, such as
text files, audio files and video files, need to be accessed sequentially.

Direct Access

Direct access is mostly required in the case of database systems. In most
cases, we need filtered information from the database. Sequential access can be
very slow and inefficient in such cases.

Suppose every block of the storage stores 4 records and we know that the record we
need is stored in the 10th block. In that case, sequential access is a poor choice
because it must traverse all the preceding blocks in order to reach the needed record.

Direct access gives the required result directly, although the operating system
has to perform some more complex tasks, such as determining the desired block number.
It is generally implemented in database applications.

Indexed Access

If a file can be sorted on any of its fields, then an index can be assigned to a group of
certain records, and a particular record can be accessed by its index. The index is
nothing but the address of a record in the file.

With indexed access, searching in a large database becomes very quick and easy, but we
need some extra space in memory to store the index values.
Directory Structure



What is a directory?
Directory can be defined as the listing of the related files on the disk. The directory may
store some or the entire file attributes.
To get the benefit of different file systems on the different operating systems, A hard
disk can be divided into the number of partitions of different sizes. The partitions are
also called volumes or mini disks.

Each partition must have at least one directory in which all the files of the partition can
be listed. A directory entry is maintained for each file in the directory, and it stores all the
information related to that file.

A directory can be viewed as a file which contains the metadata of a group of
files. Every directory supports a number of common operations on files:

1. File creation
2. Searching for a file
3. File deletion
4. Renaming a file
5. Traversing files
6. Listing files
Single Level Directory
The simplest method is to have one big list of all the files on the disk. The entire system
contains only one directory, which lists all the files present in the file system. The
directory contains one entry for each file.

This type of directory can be used for a simple system.

Advantages
1. Implementation is very simple.
2. If the number of files is small, searching is fast.
3. File creation, searching and deletion are very simple since there is only one directory.

Disadvantages
1. We cannot have two files with the same name.
2. The directory may be very big, so searching for a file may take a long time.
3. Protection cannot be implemented for multiple users.
4. There is no way to group files of the same kind.
5. Choosing a unique name for every file is difficult and limits the number of
files in the system, because most operating systems limit the number of
characters that can be used in a file name.

Two Level Directory


In a two-level directory system, we can create a separate directory for each user. There
is one master directory which contains a separate directory dedicated to each user. Each
user's directory sits at the second level and contains that user's files. The system
doesn't let a user enter another user's directory without permission.

Characteristics of two level directory system


1. Each file has a path name of the form /user-name/file-name.
2. Different users can have files with the same name.
3. Searching becomes more efficient as only one user's list needs to be traversed.
4. Files of the same kind cannot be grouped into a single directory for a particular user.

The operating system maintains a variable, PWD, which holds the present
directory name (the present user name) so that searching is done in the right directory.
Tree Structured Directory
In a tree-structured directory system, any directory entry can be either a file or a
subdirectory. This overcomes the drawbacks of the two-level directory system: files of
a similar kind can now be grouped in one directory.

Each user has their own directory and cannot enter another user's directory.
However, a user has permission to read the root's data but cannot write or
modify it. Only the administrator of the system has complete access to the root directory.

Searching is more efficient in this directory structure. The concept of a current working
directory is used. A file can be accessed by two types of path, relative or absolute.

An absolute path is the path of the file with respect to the root directory of the system,
while a relative path is the path with respect to the current working directory. In
tree-structured directory systems, the user is given the privilege to create files as
well as directories.
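A small sketch of how a relative path is resolved against the current working directory, using Python's standard `posixpath` module. The helper name `to_absolute` and the example directories are hypothetical.

```python
import posixpath


def to_absolute(path, cwd):
    """Resolve a path against the current working directory."""
    if posixpath.isabs(path):
        # An absolute path already starts at the root directory.
        return posixpath.normpath(path)
    # A relative path is interpreted starting from the working directory.
    return posixpath.normpath(posixpath.join(cwd, path))


print(to_absolute("notes/os.txt", "/home/user1"))
print(to_absolute("../user2/a.txt", "/home/user1"))
print(to_absolute("/etc/passwd", "/home/user1"))
```

The same file can therefore be named in several ways, depending on where the working directory currently is.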

Permissions on the file and directory
A tree-structured directory system may consist of various levels, so a set
of permissions is assigned to each file and directory.

The permissions are R, W and X, which stand for reading, writing and executing
the file or directory. They are assigned to three classes of users: owner, group
and others.

There is an identification bit which differentiates between a directory and a file. For a
directory, it is d.
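As an illustration of the owner/group/others scheme, Python's standard `stat.filemode` turns a numeric mode into the familiar ten-character string. The `describe` helper and the sample modes below are assumptions made for this sketch.

```python
import stat


def describe(mode):
    """Split a mode into its type flag and the three rwx triplets."""
    text = stat.filemode(mode)  # e.g. 'drwxr-x---'
    kind = {"d": "directory", "-": "regular file"}.get(text[0], "other")
    return {
        "type": kind,
        "owner": text[1:4],
        "group": text[4:7],
        "others": text[7:10],
    }


# A directory: owner may do everything, group may read and enter it.
print(describe(stat.S_IFDIR | 0o750))
# A regular file: owner may read and write, everyone else may only read.
print(describe(stat.S_IFREG | 0o644))
```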

Acyclic-Graph Structured Directories


The tree-structured directory system doesn't allow the same file to exist in multiple
directories, so sharing is a major concern in a tree-structured directory system. We
can provide sharing by making the directory an acyclic graph. In this system, two or
more directory entries can point to the same file or subdirectory. That file or subdirectory
is shared between the directory entries.

These kinds of directory graphs can be made using links or aliases. We can have
multiple paths to the same file. Links can be either symbolic (logical) or hard
(physical).

If a file gets deleted in an acyclic-graph structured directory system, then
1. In the case of a soft link, the file is simply deleted and the link is left as a
dangling pointer.

2. In the case of a hard link, the actual file is deleted only when all the references to it
have been deleted.
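The two deletion rules can be sketched with a toy in-memory model. This is not how any real file system is coded; the `FileSystem` class, its fields and the example paths are all hypothetical, and a simple reference count stands in for the hard-link count.

```python
class FileSystem:
    """Toy model: names point either at file data (hard) or at a path (soft)."""

    def __init__(self):
        self.inodes = {}   # inode number -> [link count, data]
        self.names = {}    # path -> ("hard", inode number) or ("soft", target path)
        self._next_inode = 1

    def create(self, path, data):
        ino = self._next_inode
        self._next_inode += 1
        self.inodes[ino] = [1, data]
        self.names[path] = ("hard", ino)

    def hard_link(self, existing, new):
        kind, ino = self.names[existing]
        if kind != "hard":
            raise ValueError("this sketch only hard-links to hard names")
        self.inodes[ino][0] += 1          # one more reference to the same file
        self.names[new] = ("hard", ino)

    def soft_link(self, target, new):
        self.names[new] = ("soft", target)  # stores only the target's path

    def delete(self, path):
        kind, ref = self.names.pop(path)
        if kind == "hard":
            self.inodes[ref][0] -= 1
            if self.inodes[ref][0] == 0:    # last reference gone: free the file
                del self.inodes[ref]

    def read(self, path):
        kind, ref = self.names[path]
        if kind == "soft":
            if ref not in self.names:
                raise FileNotFoundError(f"dangling link: {path} -> {ref}")
            return self.read(ref)
        return self.inodes[ref][1]


fs = FileSystem()
fs.create("/a/report", "data")
fs.hard_link("/a/report", "/b/report")
fs.soft_link("/a/report", "/c/shortcut")
fs.delete("/a/report")
print(fs.read("/b/report"))   # the hard link still reaches the data
```

After the delete, reading `/c/shortcut` raises `FileNotFoundError` in this model, which is the dangling-pointer case described above.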


File Systems
The file system is the part of the operating system which is responsible for file management.
It provides a mechanism to store data and to access file contents, including data
and programs. Some operating systems, for example Ubuntu, treat everything as a file.

The File system takes care of the following issues

o File Structure

We have seen various data structures in which the file can be stored. The task of
the file system is to maintain an optimal file structure.

o Recovering Free space

Whenever a file gets deleted from the hard disk, there is a free space created in
the disk. There can be many such spaces which need to be recovered in order to
reallocate them to other files.

o Disk space assignment to the files

The major concern is deciding where to store the files on the hard
disk. There are various allocation methods for this, which are covered later.

o Tracking data location

A file may or may not be stored within only one block. It can be stored in
non-contiguous blocks on the disk. We need to keep track of all the blocks on which
parts of the file reside.

File System Structure


A file system provides efficient access to the disk by allowing data to be stored, located
and retrieved in a convenient way. A file system must be able to store the file, locate the
file and retrieve the file.

Most operating systems use a layered approach for every task, including file
systems. Every layer of the file system is responsible for some activities.

The image shown below illustrates how the file system is divided into different layers,
and the functionality of each layer.

o When an application program asks for a file, the first request is directed to the logical file
system. The logical file system contains the metadata of the file and the directory structure.
If the application program doesn't have the required permissions for the file, this
layer throws an error. The logical file system also verifies the path to the file.
o Generally, files are divided into various logical blocks. Files are stored on the hard
disk and retrieved from it. The hard disk is divided into various tracks and
sectors. Therefore, in order to store and retrieve the files, the logical blocks need to be
mapped to physical blocks. This mapping is done by the file organization module. It is also
responsible for free space management.

o Once the file organization module has decided which physical block the application program
needs, it passes this information to the basic file system. The basic file system is responsible
for issuing the commands to I/O control in order to fetch those blocks.
o I/O control contains the code through which the hard disk is accessed. This code is
known as the device drivers. I/O control is also responsible for handling interrupts.

Master Boot Record (MBR)


The master boot record is the information present in the first sector of any hard disk. It
contains information about how and where the operating system is located on
the hard disk so that it can be loaded into RAM.

The MBR is sometimes called the master partition table because it includes a partition table
which locates every partition on the hard disk.

The MBR also includes a program which reads the boot sector record
of the partition that contains the operating system.
What happens when you turn on your computer?
Because main memory is volatile, it holds no program when the computer is turned on,
so the CPU cannot start from main memory. Instead, the CPU first runs a special
program called the BIOS, which is stored in ROM.

The BIOS contains the code by executing which the CPU accesses the very first sector of
the hard disk, that is, the MBR. It contains a partition table for all the partitions of the hard disk.

Since the MBR contains the information about where the operating system is stored,
and also contains a program which can read the boot sector record of that partition,
the CPU fetches all this information and loads the operating system into main
memory.

On Disk Data Structures


There are various on-disk data structures that are used to implement a file system. These
structures may vary depending upon the operating system.

1. Boot Control Block

Boot Control Block contains all the information which is needed to boot an
operating system from that volume. It is called the boot block in the UNIX file system. In
NTFS, it is called the partition boot sector.
2. Volume Control Block

Volume control block contains all the information regarding that volume, such as the number of
blocks, the size of each block, the partition table, and pointers to free blocks and free FCB
blocks. In the UNIX file system, it is known as the super block. In NTFS, this information
is stored inside the master file table.

3. Directory Structure (per file system)

A directory structure (per file system) contains file names and pointers to the
corresponding FCBs. In UNIX, it includes the inode numbers associated with the file
names.

4. File Control Block

File control block contains all the details about the file such as ownership details,
permission details, file size, etc. In UFS, this detail is stored in the inode. In NTFS, this
information is stored inside the master file table as a relational database structure. A
typical file control block is shown in the image below.

In Memory Data Structure


So far, we have discussed the data structures that must be present on the
hard disk in order to implement file systems. Here, we discuss the data structures
that must be present in memory in order to implement the file system.

The in-memory data structures are used for file system management as well as
performance improvement via caching. This information is loaded at mount time
and discarded at ejection.

1. In-memory Mount Table

In-memory mount table contains the list of all the devices which are
mounted on the system. Whenever a connection is established to a device, its
entry is made in the mount table.

2. In-memory Directory structure cache

This is the list of directories recently accessed by the CPU. The directories
in the list are likely to be accessed again in the near future, so it is better to
store them temporarily in a cache.

3. System-wide open file table

This is the list of all the open files in the system at a particular time. Whenever a
user opens a file for reading or writing, an entry is made in this open file
table.

4. Per process Open file table

It is the list of open files for each process. Since there is already a system-wide
list of every open file, it only contains pointers
to the appropriate entries in the system-wide table.

Directory Implementation
There are a number of algorithms by which directories can be implemented.
The selection of an appropriate directory implementation algorithm may
significantly affect the performance of the system.

The directory implementation algorithms are classified according to the data structure
they use. There are mainly two algorithms in use these days.

1. Linear List
In this algorithm, all the files in a directory are maintained as a singly linked list. Each file
contains the pointers to the data blocks which are assigned to it and to the next file in the
directory.

Characteristics

1. When a new file is created, the entire list is checked to see whether the new file name
matches an existing file name. If it doesn't, the file can be created at
the beginning or at the end of the list. Therefore, searching for a unique name is a big concern
because traversing the whole list takes time.

2. The list needs to be traversed for every operation (creation, deletion, updating,
etc.) on the files, so the system becomes inefficient.

2. Hash Table
To overcome the drawbacks of the singly linked list implementation of directories, there is
an alternative approach, the hash table. This approach uses a hash table
along with the linked lists.

A key-value pair for each file in the directory is generated and stored in the hash
table. The key is determined by applying the hash function to the file name, while
the value points to the corresponding file stored in the directory.

Now searching becomes efficient because the entire list is not
searched on every operation. Only the hash table entries are checked using the key, and if
an entry is found, the corresponding file is fetched using the value.
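The two directory implementations can be compared with a short sketch. The class names, the toy hash function (sum of the name's bytes) and the bucket count are assumptions made for the illustration; each hash bucket reuses the linear list as its chain.

```python
class LinearListDirectory:
    """Directory kept as one list that is scanned from the front."""

    def __init__(self):
        self.entries = []  # (file name, first data block)

    def lookup(self, name):
        comparisons = 0
        for entry_name, block in self.entries:
            comparisons += 1
            if entry_name == name:
                return block, comparisons
        return None, comparisons

    def create(self, name, block):
        found, _ = self.lookup(name)  # the uniqueness check walks the list
        if found is not None:
            raise FileExistsError(name)
        self.entries.append((name, block))


class HashTableDirectory:
    """Directory that hashes the file name to pick one short chain."""

    def __init__(self, buckets=16):
        self.buckets = [LinearListDirectory() for _ in range(buckets)]

    def _chain(self, name):
        # Toy hash function (an assumption): sum of the name's bytes.
        return self.buckets[sum(name.encode()) % len(self.buckets)]

    def lookup(self, name):
        return self._chain(name).lookup(name)

    def create(self, name, block):
        self._chain(name).create(name, block)


linear, hashed = LinearListDirectory(), HashTableDirectory()
for i in range(100):
    linear.create(f"file{i}", i)
    hashed.create(f"file{i}", i)

print(linear.lookup("file99"))  # scans all 100 entries
print(hashed.lookup("file99"))  # scans only one chain
```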

Allocation Methods
There are various methods which can be used to allocate disk space to files.
Selecting an appropriate allocation method significantly affects the performance
and efficiency of the system. The allocation method determines how the disk is
utilized and how the files are accessed.

The following methods can be used for allocation.


1. Contiguous Allocation.
2. Extents
3. Linked Allocation
4. Clustering
5. FAT
6. Indexed Allocation
7. Linked Indexed Allocation
8. Multilevel Indexed Allocation
9. Inode

We will discuss three of the most used methods in detail.

Contiguous Allocation
If the blocks are allocated to the file in such a way that all the logical blocks of the file
get contiguous physical blocks on the hard disk, then such an allocation scheme is known
as contiguous allocation.

In the image shown below, there are three files in the directory. The starting block and
the length of each file are mentioned in the table. We can check in the table that
contiguous blocks are assigned to each file as per its need.

Advantages
1. It is simple to implement.
2. Read performance is excellent.
3. It supports random access into files.

Disadvantages
1. The disk will become fragmented.
2. It may be difficult for a file to grow.
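A minimal sketch of both sides of contiguous allocation: the address calculation that makes random access cheap, and a first-fit search that shows how fragmentation can block an allocation. The function names and the sample free map are hypothetical.

```python
def physical_block(start, length, logical_block):
    """Map a logical block of a file to its physical block on disk."""
    if not 0 <= logical_block < length:
        raise IndexError("logical block outside the file")
    return start + logical_block  # one addition, no traversal needed


def first_fit(free_map, length):
    """Return the start of the first run of `length` free blocks, or None."""
    run = 0
    for i, is_free in enumerate(free_map):
        run = run + 1 if is_free else 0
        if run == length:
            return i - length + 1
    return None


# A file that starts at block 14 and is 3 blocks long (made-up values).
print(physical_block(14, 3, 2))

# Four blocks are free in total, but never three in a row.
free_map = [True, False, True, True, False, True]
print(first_fit(free_map, 2))
print(first_fit(free_map, 3))
```

The last call finds nothing even though enough blocks are free overall, which is the fragmentation problem listed above.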

Linked List Allocation


Linked list allocation solves the problems of contiguous allocation. In linked list
allocation, each file is treated as a linked list of disk blocks. The disk
blocks allocated to a particular file need not be contiguous on the disk. Each disk
block allocated to a file contains a pointer which points to the next disk block allocated
to the same file.

Advantages
1. There is no external fragmentation with linked allocation.
2. Any free block can be utilized in order to satisfy the file block requests.
3. A file can continue to grow as long as free blocks are available.
4. The directory entry only contains the starting block address.

Disadvantages
1. Random access is not provided.
2. Pointers require some space in the disk blocks.
3. None of the pointers in the linked list may be broken, otherwise the file gets
corrupted.
4. Each block needs to be traversed.
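The cost of reaching the n-th block under linked allocation can be sketched as follows. The `disk` dictionary, the block numbers and the function name are made up for the illustration; each entry holds a block's data and the number of the next block.

```python
# Hypothetical disk: block number -> (data, next block number or None).
disk = {
    9: ("A0", 16),
    16: ("A1", 1),
    1: ("A2", 10),
    10: ("A3", 25),
    25: ("A4", None),
}


def read_logical_block(disk, start, n):
    """Return (data, disk reads) for logical block n of the file at `start`."""
    block = start
    reads = 0
    for _ in range(n):
        _, next_block = disk[block]  # must read a block to learn its successor
        reads += 1
        if next_block is None:
            raise IndexError("file has fewer blocks than requested")
        block = next_block
    data, _ = disk[block]
    return data, reads + 1


print(read_logical_block(disk, 9, 0))
print(read_logical_block(disk, 9, 3))
```

Reaching logical block 3 takes four reads here, because the three blocks before it have to be read just to follow their pointers.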

File Allocation Table


The main disadvantage of linked list allocation is that random access to a particular
block is not provided. In order to access a block, we need to access all its previous
blocks.

The file allocation table overcomes this drawback of linked list allocation. In this scheme, a
file allocation table is maintained which gathers all the disk block links. The table has
one entry for each disk block and is indexed by block number.

The file allocation table needs to be cached in order to reduce the number of head seeks.
Now the head doesn't need to traverse all the disk blocks in order to access a
later block.

It simply accesses the file allocation table, reads the desired block entry from there and
accesses that block. This is how random access is accomplished using
FAT. It is used by MS-DOS and pre-NT Windows versions.

Advantages

1. Uses the whole disk block for data.
2. A bad disk block doesn't cause all successive blocks to be lost.
3. Random access is provided, although it is not very fast.
4. Only the FAT needs to be traversed in each file operation.

Disadvantages
1. Each disk block needs a FAT entry.
2. The FAT may be very big, depending upon the number of FAT entries.
3. The number of FAT entries can be reduced by increasing the block size, but that also
increases internal fragmentation.
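A rough sketch of how a file's blocks are found through the table alone. The table size, the `END` and `FREE` markers and the block numbers are assumptions for the illustration and do not follow the on-disk format of any real FAT variant.

```python
END = -1     # marks the last block of a file (assumed marker)
FREE = None  # marks an unallocated block (assumed marker)

# One table entry per disk block, indexed by block number.
fat = [FREE] * 32
for block, next_block in [(9, 16), (16, 1), (1, 10), (10, 25), (25, END)]:
    fat[block] = next_block


def block_chain(fat, start):
    """List every block of the file by following table entries only."""
    chain = []
    block = start
    while block != END:
        chain.append(block)
        block = fat[block]
    return chain


def nth_block(fat, start, n):
    """Find logical block n using table lookups, without touching data blocks."""
    block = start
    for _ in range(n):
        block = fat[block]
        if block == END:
            raise IndexError("file has fewer blocks than requested")
    return block


print(block_chain(fat, 9))
print(nth_block(fat, 9, 3))
```

The chain is still followed link by link, but every step is a lookup in the cached table rather than a disk read, which is why random access is possible yet not especially fast.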

Indexed Allocation
Limitation of FAT
Limitations in an existing technology drive the evolution of a new technology. So far,
we have seen various allocation methods, each of which carries several
advantages and disadvantages.

The file allocation table tries to solve as many problems as possible but leads to a
drawback: the more blocks there are, the bigger the FAT becomes.

Therefore, we need to allocate more space to the file allocation table. Since the file allocation
table needs to be cached, it may not be possible to have that much space in the cache.
Here we need a new technology which can solve such problems.

Indexed Allocation Scheme


Instead of maintaining a file allocation table of all the disk pointers, the indexed allocation
scheme stores all the disk pointers of a file in one block, called the index block. The index
block doesn't hold file data; it holds the pointers to all the disk blocks allocated
to that particular file. The directory entry only contains the index block address.
Advantages
1. Supports direct access
2. A bad data block causes the loss of only that block.

Disadvantages
1. A bad index block could cause the loss of the entire file.
2. The size of a file depends upon the number of pointers an index block can hold.
3. Having an index block for a small file is wasteful.
4. More pointer overhead
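A small sketch of the index-block lookup and of the file-size limit it implies. The block numbers, the 4096-byte block size and the 4-byte pointer size are assumed values, and the function names are hypothetical.

```python
# Hypothetical index block: entry i holds the disk block of logical block i.
index_block = [9, 16, 1, 10, 25]


def physical_block(index_block, logical_block):
    """Direct access: a single lookup in the index block."""
    if not 0 <= logical_block < len(index_block):
        raise IndexError("logical block outside the file")
    return index_block[logical_block]


def max_file_size(block_size, pointer_size):
    """Largest file one index block can describe, in bytes."""
    pointers = block_size // pointer_size
    return pointers * block_size


print(physical_block(index_block, 3))
print(max_file_size(4096, 4))
```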

Linked Index Allocation


Single level linked Index Allocation
In indexed allocation, the file size is limited by the size of a disk block, since the index
block must hold every pointer. To allow large files, we have to link several index blocks
together. In linked index allocation, an index block contains:

o A small header giving the name of the file

o A set of the first 100 block addresses

o A pointer to another index block

For larger files, the last entry of the index block is a pointer which points to another
index block. This is also called the linked scheme.

Advantage: It removes file size limitations

Disadvantage: Random Access becomes a bit harder

Multilevel Index Allocation


In Multilevel index allocation, we have various levels of indices. There are outer level
index blocks which contain the pointers to the inner level index blocks and the inner
level index blocks contain the pointers to the file data.

o The outer level index is used to find the inner level index.

o The inner level index is used to find the desired data block.

Advantage: Random Access becomes better and efficient.


Disadvantage: Access time for a file will be higher.

Inode
In UNIX-based operating systems, each file is indexed by an inode. Inodes are
special disk blocks which are created when the file system is created. The number of
files or directories in a file system depends on the number of inodes in the file system.

An inode includes the following information:

1. Attributes (permissions, time stamp, ownership details, etc.) of the file.
2. A number of direct blocks, which contain the pointers to the first 12 blocks of the file.
3. A single indirect pointer which points to an index block. It is used if the file cannot
be indexed entirely by the direct blocks.
4. A double indirect pointer which points to a disk block that is a collection of pointers
to disk blocks which are index blocks. It is used if the file is too big to be indexed
entirely by the direct blocks and the single indirect pointer.
5. A triple indirect pointer that points to a disk block that is a collection of pointers. Each
of these pointers points to a disk block which also contains a collection of
pointers, each of which points to an index block that contains the pointers to
the file blocks.
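The pointer levels listed above determine both the maximum file size and which level serves a given block. The sketch below assumes a 4096-byte block and 4-byte pointers in its examples; only the 12 direct pointers come from the text, and the function names are hypothetical.

```python
DIRECT_POINTERS = 12  # from the text


def pointers_per_block(block_size, pointer_size):
    return block_size // pointer_size


def max_file_blocks(block_size, pointer_size):
    """Blocks reachable through direct, single, double and triple pointers."""
    n = pointers_per_block(block_size, pointer_size)
    return DIRECT_POINTERS + n + n**2 + n**3


def level_for_block(logical_block, block_size, pointer_size):
    """Name the pointer level that leads to a given logical block."""
    n = pointers_per_block(block_size, pointer_size)
    if logical_block < DIRECT_POINTERS:
        return "direct"
    logical_block -= DIRECT_POINTERS
    if logical_block < n:
        return "single indirect"
    logical_block -= n
    if logical_block < n**2:
        return "double indirect"
    logical_block -= n**2
    if logical_block < n**3:
        return "triple indirect"
    raise ValueError("block is beyond the largest file this inode can index")


print(level_for_block(11, 4096, 4))
print(level_for_block(12, 4096, 4))
print(max_file_blocks(4096, 4) * 4096, "bytes at most")
```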

Inode video: https://www.youtube.com/watch?v=_6VJ8WfWI4k
Disk Scheduling Algorithms

Disk scheduling is done by operating systems to schedule I/O requests arriving for the disk.
Disk scheduling is also known as I/O scheduling.
Disk scheduling is important because:
∙ Multiple I/O requests may arrive from different processes, and only one I/O request can be
served at a time by the disk controller. Thus other I/O requests need to wait in the
waiting queue and need to be scheduled.
∙ Two or more requests may be far from each other, which can result in greater disk arm
movement.
∙ Hard drives are one of the slowest parts of the computer system and thus need to be
accessed in an efficient manner.
There are many disk scheduling algorithms, but before discussing them let’s have a quick look
at some of the important terms:
∙ Seek Time: Seek time is the time taken to move the disk arm to the specified track where
the data is to be read or written. So the disk scheduling algorithm that gives the minimum
average seek time is better.
∙ Rotational Latency: Rotational latency is the time taken by the desired sector of the disk to
rotate into a position where the read/write head can access it. So the disk
scheduling algorithm that gives the minimum rotational latency is better.
∙ Transfer Time: Transfer time is the time to transfer the data. It depends on the rotating
speed of the disk and the number of bytes to be transferred.
∙ Disk Access Time: Disk access time is:
Disk Access Time = Seek Time + Rotational Latency + Transfer Time

∙ Disk Response Time: Response time is the time a request spends waiting to perform
its I/O operation. Average response time is the mean of the response times of all
requests. Variance of response time measures how far the service of individual requests
deviates from the average response time. So the disk scheduling algorithm that gives
the minimum variance of response time is better.

Purpose of Disk Scheduling


The main purpose of a disk scheduling algorithm is to select a disk request from the queue of I/O
requests and decide when this request will be processed.
Goal of Disk Scheduling Algorithm
o Fairness
o High throughput
o Minimal head travel time

Disk Scheduling Algorithms


The list of various disk scheduling algorithms is given below. Each algorithm carries some
advantages and disadvantages. The limitation of each algorithm leads to the evolution of a new
algorithm.
o FCFS scheduling algorithm
o SSTF (shortest seek time first) algorithm
o SCAN scheduling
o C-SCAN scheduling
o LOOK Scheduling
o C-LOOK scheduling

Disk Scheduling Algorithms

FCFS: FCFS (First Come, First Served) is the simplest of all the disk scheduling algorithms.
In FCFS, the requests are addressed in the order they arrive in the disk queue. Let us
understand this with the help of an example.
Example:
Suppose the order of requests is (82,170,43,140,24,16,190)
and the current position of the read/write head is 50.

So, the total seek time (measured as head movement in tracks) is:


=(82-50)+(170-82)+(170-43)+(140-43)+(140-24)+(24-16)+(190-16)
=642
Advantages:
∙Every request gets a fair chance
∙No indefinite postponement
Disadvantages:
∙Does not try to optimize seek time
∙May not provide the best possible service
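The FCFS calculation above can be reproduced with a short sketch. The function name `fcfs` is hypothetical; the request queue and head position are the ones from the example.

```python
def fcfs(requests, head):
    """Total head movement when requests are served in arrival order."""
    total = 0
    for track in requests:
        total += abs(track - head)  # distance from the current position
        head = track                # the head now rests on this track
    return total


print(fcfs([82, 170, 43, 140, 24, 16, 190], 50))  # 642, as in the example
```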

SSTF: In SSTF (Shortest Seek Time First), the request with the shortest seek time from the
current head position is executed first. So the seek time of every request in the queue is
calculated in advance, and the requests are scheduled according to their calculated seek
times. As a result, the request nearest the disk arm gets executed first. SSTF is certainly an
improvement over FCFS as it decreases the average response time and increases the
throughput of the system. Let us understand this with the help of an example.
Example:
Suppose the order of requests is (82,170,43,140,24,16,190)
and the current position of the read/write head is 50.

So, the total seek time is:


=(50-43)+(43-24)+(24-16)+(82-16)+(140-82)+(170-140)+(190-170)
=208
Advantages:
∙Average Response Time decreases
∙Throughput increases
Disadvantages:
∙Overhead to calculate seek time in advance
∙Can cause Starvation for a request if it has higher seek time as compared to incoming
requests
∙High variance of response time as SSTF favours only some requests
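A minimal sketch of SSTF that returns both the service order and the total head movement for the example above. The function name `sstf` is hypothetical, and ties are broken by queue order, which is an assumption the text does not specify.

```python
def sstf(requests, head):
    """Serve the pending request closest to the head, repeatedly."""
    pending = list(requests)
    order, total = [], 0
    while pending:
        nearest = min(pending, key=lambda track: abs(track - head))
        pending.remove(nearest)
        total += abs(nearest - head)
        head = nearest
        order.append(nearest)
    return order, total


order, total = sstf([82, 170, 43, 140, 24, 16, 190], 50)
print(order)  # [43, 24, 16, 82, 140, 170, 190]
print(total)  # 208, as in the example
```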

SCAN: In the SCAN algorithm the disk arm moves in a particular direction and services the
requests in its path. After reaching the end of the disk, it reverses its direction and again
services the requests in its path. The algorithm works like an elevator and hence is also
known as the elevator algorithm. As a result, requests in the midrange are serviced more
often, and those arriving just behind the disk arm have to wait.
Example:
Suppose the requests to be addressed are 82,170,43,140,24,16,190, the read/write arm
is at 50, and the disk arm should move “towards the larger value”. The calculation takes
track 199 as the last track of the disk.

Therefore, the seek time is calculated as:


=(199-50)+(199-16)
=332
Advantages:
∙High throughput
∙Low variance of response time
∙Average response time
Disadvantages:
∙Long waiting time for requests for locations just visited by disk arm
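The SCAN example can be checked with a short sketch. The function name `scan` is hypothetical, and the track range 0 to 199 is an assumption taken from the 199 used in the calculation above.

```python
def scan(requests, head, first_track=0, last_track=199, toward_larger=True):
    """Sweep to the end of the disk, then reverse and sweep back."""
    lower = sorted((t for t in requests if t < head), reverse=True)
    upper = sorted(t for t in requests if t >= head)
    if toward_larger:
        served = upper + lower
        path = upper + [last_track] + lower   # the arm touches the last track
    else:
        served = lower + upper
        path = lower + [first_track] + upper  # the arm touches the first track
    total, position = 0, head
    for track in path:
        total += abs(track - position)
        position = track
    return served, total


served, total = scan([82, 170, 43, 140, 24, 16, 190], 50)
print(served)  # [82, 140, 170, 190, 43, 24, 16]
print(total)   # 332, as in the example
```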

CSCAN: In the SCAN algorithm, the disk arm scans again the path it has just scanned after
reversing its direction. So too many requests may be waiting at the other end, while there
may be zero or few requests pending in the area just scanned.
These situations are avoided in the CSCAN algorithm, in which the disk arm, instead of
reversing its direction, goes to the other end of the disk and starts servicing the requests
from there. The disk arm thus moves in a circular fashion. The algorithm is otherwise similar
to SCAN, and hence it is known as C-SCAN (Circular SCAN).
Example:
Suppose the requests to be addressed are 82,170,43,140,24,16,190, the read/write arm
is at 50, and the disk arm should move “towards the larger value”. The calculation takes
tracks 0 and 199 as the two ends of the disk.

Seek time is calculated as:
=(199-50)+(199-0)+(43-0)
=391
Advantages:
∙Provides more uniform wait time compared to SCAN
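A sketch of C-SCAN for an arm moving toward larger track numbers. The function name `c_scan` is hypothetical, and the track range 0 to 199 is an assumption taken from the calculation above; as in that calculation, the return sweep from 199 to 0 is counted as head movement.

```python
def c_scan(requests, head, first_track=0, last_track=199):
    """Sweep upward to the last track, jump to the first, sweep upward again."""
    upper = sorted(t for t in requests if t >= head)
    lower = sorted(t for t in requests if t < head)
    served = upper + lower
    path = upper + [last_track, first_track] + lower
    total, position = 0, head
    for track in path:
        total += abs(track - position)
        position = track
    return served, total


served, total = c_scan([82, 170, 43, 140, 24, 16, 190], 50)
print(served)  # [82, 140, 170, 190, 16, 24, 43]
print(total)   # 391, as in the example
```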

LOOK: It is similar to the SCAN disk scheduling algorithm, except that the disk arm, instead
of going to the end of the disk, goes only as far as the last request to be serviced in front of
the head and then reverses its direction from there. Thus it prevents the extra delay
caused by unnecessary traversal to the end of the disk.
Example:
Suppose the requests to be addressed are 82,170,43,140,24,16,190, the read/write arm
is at 50, and the disk arm should move “towards the larger value”.

So, the seek time is calculated as:
=(190-50)+(190-16)
=314
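A sketch of LOOK for the same example. The function name `look` is hypothetical; unlike SCAN, the path contains only requested tracks, so the arm turns around at the last request in its direction.

```python
def look(requests, head, toward_larger=True):
    """Sweep only as far as the last request, then reverse."""
    lower = sorted((t for t in requests if t < head), reverse=True)
    upper = sorted(t for t in requests if t >= head)
    served = upper + lower if toward_larger else lower + upper
    total, position = 0, head
    for track in served:
        total += abs(track - position)
        position = track
    return served, total


served, total = look([82, 170, 43, 140, 24, 16, 190], 50)
print(served)  # [82, 140, 170, 190, 43, 24, 16]
print(total)   # 314, as in the example
```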

CLOOK: As LOOK is similar to the SCAN algorithm, CLOOK is similarly related to the CSCAN
disk scheduling algorithm. In CLOOK, the disk arm, instead of going to the end of the disk,
goes only as far as the last request to be serviced in front of the head, and from there goes
to the last request at the other end. Thus, it also prevents the extra delay caused by
unnecessary traversal to the end of the disk.

Example:
Suppose the requests to be addressed are 82,170,43,140,24,16,190, the read/write arm
is at 50, and the disk arm should move “towards the larger value”.

So, the seek time is calculated as:
=(190-50)+(190-16)+(43-16)
=341
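A sketch of C-LOOK for an arm moving toward larger track numbers. The function name `c_look` is hypothetical; the jump from the highest request back to the lowest one is counted as head movement, as in the calculation above.

```python
def c_look(requests, head):
    """Sweep upward to the last request, jump to the lowest, sweep upward again."""
    upper = sorted(t for t in requests if t >= head)
    lower = sorted(t for t in requests if t < head)
    served = upper + lower
    total, position = 0, head
    for track in served:
        total += abs(track - position)
        position = track
    return served, total


served, total = c_look([82, 170, 43, 140, 24, 16, 190], 50)
print(served)  # [82, 140, 170, 190, 16, 24, 43]
print(total)   # 341, as in the example
```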

RSS– It stands for random scheduling and, as its name suggests, it services requests in
random order. It suits situations where scheduling involves random attributes such as random
processing times, random due dates, random weights and stochastic machine breakdowns,
which is why it is usually used for analysis and simulation.
LIFO– In the LIFO (Last In, First Out) algorithm, the newest jobs are serviced before the
existing ones, i.e. the request that entered last is serviced first, and then the rest in the
same order.
Advantages
∙ Maximizes locality and resource utilization
Disadvantages
∙ Can seem a little unfair to other requests, and if new requests keep coming in, it
causes starvation of the old and existing ones.
Example
Suppose the order of requests is (82,170,43,142,24,16,190)
and the current position of the read/write head is 50.

N-STEP SCAN – It is also known as the N-STEP LOOK algorithm. In this algorithm a buffer is
created for N requests. All requests belonging to a buffer are serviced in one go. Once the
buffer is full, no new requests are added to it; they are sent to another buffer. When these N
requests have been serviced, the next N requests are taken up, and in this way all requests
get guaranteed service.
Advantages
∙ It eliminates starvation of requests completely

FSCAN– This algorithm uses two sub-queues. During the scan, all requests in the first queue
are serviced and new incoming requests are added to the second queue. All new requests
are kept on hold until the existing requests in the first queue have been serviced.
Advantages
∙ FSCAN, along with N-Step-SCAN, prevents “arm stickiness” (a phenomenon in I/O
scheduling where the scheduling algorithm continues to service requests at or
near the current sector and thus prevents any seeking)

Free Space Management in Operating System

Free space management


The space on a disk is limited, so we need to reuse the space of deleted files for the
allocation of new files. Write-once optical disks allow only one write to any given sector,
so reusing their space is not physically possible. The system maintains a free space list
to keep track of the free disk space. The free space list records all the free disk blocks,
that is, the blocks not allocated to any file or directory. When we create a file, we search
the free space list for the required amount of space. If the space is available, it is
allocated to the new file and then removed from the free space list. Whenever we delete a
file, its disk space is added back to the free space list.
The process of looking after and managing the free blocks of the disk is called free space
management. There are several methods or techniques to implement a free space list. These are
as follows:
∙ Bitmap
∙ Linked list
∙ Grouping
∙ Counting
1. Bitmap
When the free space list is implemented as a bitmap (bit vector), each block of the disk is
represented by one bit. When the block is free its bit is set to 1, and when the block is
allocated the bit is set to 0. The main advantage of the bitmap is that it is relatively simple
and efficient for finding the first free block, or a run of consecutive free blocks, on the disk.
Many computers provide bit-manipulation instructions that can be used for this purpose.
The number of the first free block is calculated by the formula:
(number of bits per word) X (number of 0-valued words) + offset of first 1 bit
For example, the Apple Macintosh operating system uses the bitmap method to allocate disk
space.



Advantages:
∙ This technique is relatively simple.
∙ This technique is very efficient to find the free space on the disk.
Disadvantages:
∙ Finding the first 1 bit in a non-zero word is fast only with hardware support
(bit-manipulation instructions).
∙ This technique is less practical for larger disks, because the bitmap is efficient only while
it can be kept in main memory.
For example: Consider a disk where blocks 2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25,26, and
27 are free and the rest of the blocks are allocated. The free-space bitmap would be:
001111001111110001100000011100000
2. Linked list
This is another technique for free space management, in which all the free blocks are kept in
a linked list. A head pointer, kept in a special location on the disk, points to the first free
block of the list. This block contains a pointer to the next free block, which in turn contains
a pointer to the one after it, and so on. This scheme is not efficient for searching: to traverse
the list we have to read each disk block, which requires substantial I/O time. Fortunately,
traversing the free list is not a frequent action.

Advantages:
∙ Whenever a file is to be allocated a free block, the operating system can simply allocate
the first block in free space list and move the head pointer to the next free block in the
list.
Disadvantages:
∙ Searching the free space list is very time consuming; each block has to be read
from the disk, which is very slow compared to main memory.
∙ Not efficient for faster access.
In our earlier example, block 2 is the first free block. It contains a pointer to block 3, which
contains a pointer to block 4, which contains a pointer to block 5. Block 5 points to the next
free block (block 8), and the chain continues in this way up to the last free block.
3. Grouping
This technique is a modification of the free list approach: the first free block stores the
addresses of n free blocks. The first n-1 of these blocks are actually free, and the last block
contains the addresses of another n free blocks, and so on. Unlike the standard linked list
approach, the addresses of a large number of free blocks can be found very quickly.

4. Counting
Counting is another approach to free space management. Generally, several contiguous blocks
are allocated or freed simultaneously, particularly when space is allocated with the contiguous
allocation algorithm or through clustering. So, rather than keeping a list of n free block
addresses, we can keep the address of the first free block and the number n of free contiguous
blocks that follow it. Each entry in the free space list then consists of a disk address and a
count. This method of free space management is similar to the extent method of allocating
blocks. These entries can be stored in a B-tree instead of a linked list, so that operations like
lookup, insertion, and deletion are efficient.

Free space management in Operating System
The system keeps track of the free disk blocks for allocating space to files when they are
created. Also, to reuse the space released by deleting files, free space management
becomes crucial. The system maintains a free space list which keeps track of the disk
blocks that are not allocated to some file or directory. The free space list can be
implemented mainly as:
1. Bitmap or Bit vector –
A Bitmap or Bit Vector is a series or collection of bits where each bit corresponds to a
disk block. The bit can take two values, 0 and 1: 0 indicates that the block is
allocated and 1 indicates a free block.
The given instance of disk blocks on the disk in Figure 1 (where green blocks are
allocated) can be represented by a bitmap of 16 bits as: 0000111000000110.



Advantages –
∙ Simple to understand.
∙ Finding the first free block is efficient. It requires scanning the words (here, groups of 8
bits) in a bitmap for a non-zero word. (A 0-valued word has all bits 0). The first free
block is then found by scanning for the first 1 bit in the non-zero word. The block
number can be calculated as:
(number of bits per word) * (number of 0-valued words) + offset of the first 1 bit in the
non-zero word.

For Figure-1, we scan the bitmap sequentially for the first non-zero word. The
first group of 8 bits (00001110) constitutes a non-zero word since not all of its bits are 0.
After the non-zero word is found, we look for the first 1 bit. This is the 5th bit of the
non-zero word, so offset = 5 (bits and blocks are counted from 1 in this example).
Therefore, the first free block number = 8*0+5 = 5.
2. Linked List –
In this approach, the free disk blocks are linked together, i.e. a free block contains a
pointer to the next free block. The block number of the very first free disk block is stored
at a separate location on disk and is also cached in memory.

In Figure-2, the free space list head points to Block 5 which points to Block 6, the
next free block and so on. The last free block would contain a null pointer indicating
the end of free list.
A drawback of this method is the I/O required for free space list traversal.
3. Grouping –
This approach stores the addresses of the free blocks in the first free block. The first free
block stores the addresses of some, say n, free blocks. Out of these n blocks, the first n-1
blocks are actually free and the last block contains the addresses of the next n free blocks.
An advantage of this approach is that the addresses of a group of free disk blocks can
be found easily.
4. Counting –
This approach stores the address of the first free disk block and a number n of free
contiguous disk blocks that follow the first block.
Every entry in the list would contain:
1. Address of first free disk block
2. A number n
For example, in Figure-1, the first entry of the free space list would be: ([Address of
Block 5], 2), because 2 contiguous free blocks follow block 5.
Input-output system calls in C | Create, Open, Close, Read, Write

Important Terminology
What is the File Descriptor?
A file descriptor is an integer that uniquely identifies an open file within a process.
File Descriptor table: The file descriptor table is an array of pointers to file table entries; the
array indices are the file descriptors. The operating system provides one file descriptor table
for each process.
File Table Entry: A file table entry is an in-memory structure that stands in for an open file. It is
created when a process opens a file, and it maintains the file position.
Standard File Descriptors: When a process starts, file descriptors 0, 1, and 2 in its file descriptor
table are opened automatically. By default, each of these three descriptors references a file table
entry for a file named /dev/tty.
/dev/tty: In-memory surrogate for the terminal



Terminal: Combination keyboard/video screen
Read from stdin => read from fd 0 : Whenever we type a character on the keyboard, it is read
from /dev/tty through fd 0 (stdin).

Write to stdout => write to fd 1 : Whenever we see output on the video screen, it was written
to /dev/tty through fd 1 (stdout).

Write to stderr => write to fd 2 : Whenever we see an error message on the video screen, it was
written to /dev/tty through fd 2 (stderr).



I/O System calls
There are five basic I/O system calls:
create (creat), open, close, write, read

1. Create: Used to create a new empty file.
Syntax in C language:
int creat(const char *filename, mode_t mode);
Parameter:
∙ filename : name of the file which you want to create
∙ mode : indicates permissions of the new file.
Returns:
∙ the first unused file descriptor (generally 3 when creat is first used in a process, because
fds 0, 1, 2 are reserved)
∙ -1 on error
How it works in OS
∙ Create a new empty file on disk
∙ Create a file table entry
∙ Set the first unused file descriptor to point to the file table entry
∙ Return the file descriptor used, -1 upon failure
2. open: Used to open a file for reading, writing, or both.
Syntax in C language
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int open (const char* Path, int flags [, mode_t mode ]);
Parameters
∙ Path : path to the file which you want to use
∙ use an absolute path, beginning with “/”, when you are not working in the
directory of the file.
∙ use a relative path, which is just the file name with its extension, when you are
working in the directory of the file.
∙ flags : how you would like to use the file
∙ O_RDONLY: read only, O_WRONLY: write only, O_RDWR: read and write,
O_CREAT: create file if it doesn’t exist, O_EXCL: prevent creation if it
already exists
∙ mode : permissions of the new file; needed when flags include O_CREAT
How it works in OS
∙ Find the existing file on disk
∙ Create a file table entry
∙ Set the first unused file descriptor to point to the file table entry
∙ Return the file descriptor used, -1 upon failure

// C program to illustrate
// open system call
#include <stdio.h>
#include <fcntl.h>
#include <errno.h>

int main(void)
{
    // If foo.txt does not exist in the current directory,
    // it is created; the mode (0644) is required with O_CREAT.
    int fd = open("foo.txt", O_RDONLY | O_CREAT, 0644);

    printf("fd = %d\n", fd);

    if (fd == -1)
    {
        // print the error number
        printf("Error Number %d\n", errno);

        // print a description of the error
        perror("Program");
    }
    return 0;
}

Output:
fd = 3



3. close: Tells the operating system that you are done with a file descriptor and closes the file
pointed to by fd.
Syntax in C language
#include <unistd.h>
int close(int fd);
Parameter:
∙ fd : file descriptor
Return:
∙ 0 on success.
∙ -1 on error.
How it works in the OS
∙ Destroy the file table entry referenced by element fd of the file descriptor table, as long as
no other descriptor still refers to it
∙ Set element fd of the file descriptor table to NULL

// C program to illustrate close system Call
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd1 = open("foo.txt", O_RDONLY);
    if (fd1 < 0)
    {
        perror("c1");
        exit(1);
    }
    printf("opened the fd = %d\n", fd1);

    // Using close system Call
    if (close(fd1) < 0)
    {
        perror("c1");
        exit(1);
    }
    printf("closed the fd.\n");
    return 0;
}

Output:
opened the fd = 3
closed the fd.

// C program to illustrate reuse of a file descriptor after close
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    // assume that foo.txt is already created
    int fd1 = open("foo.txt", O_RDONLY, 0);
    close(fd1);

    // assume that baz.txt is already created
    int fd2 = open("baz.txt", O_RDONLY, 0);

    printf("fd2 = %d\n", fd2);
    exit(0);
}

Output:
fd2 = 3
Here, the first open() returns 3 because, when the process is created, fds 0, 1, 2 are
already taken by stdin, stdout and stderr, so the first unused file descriptor in the file
descriptor table is 3. The close() call then frees descriptor 3 by setting its table entry to
null. When open() is called the second time, the first unused fd is again 3, so the output
of this program is 3.
4. read: From the file indicated by the file descriptor fd, the read() function reads cnt bytes of
input into the memory area indicated by buf. A successful read() updates the access time for the
file.
Syntax in C language
#include <unistd.h>
ssize_t read (int fd, void* buf, size_t cnt);
Parameters:
∙ fd: file descriptor
∙ buf: buffer to read data into
∙ cnt: number of bytes to read (at most the length of the buffer)
Returns: How many bytes were actually read
∙ the number of bytes read on success
∙ 0 on reaching end of file
∙ -1 on error
∙ -1 if interrupted by a signal before any data was read (errno is set to EINTR)
Important points
∙ buf needs to point to a valid memory location at least cnt bytes long; otherwise the read
overflows the buffer.
∙ fd should be a valid file descriptor returned from open(); if fd is not a valid open
descriptor, read() fails and returns -1.
∙ cnt is the requested number of bytes to read, while the return value is the actual number of
bytes read. The read system call may return fewer bytes than cnt.

// C program to illustrate
// read system Call
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd;
    ssize_t sz;
    char *c = calloc(100, sizeof(char));
    if (c == NULL) { perror("calloc"); exit(1); }

    fd = open("foo.txt", O_RDONLY);
    if (fd < 0) { perror("r1"); exit(1); }

    sz = read(fd, c, 10);
    if (sz < 0) { perror("r1"); exit(1); }
    printf("called read(%d, c, 10). returned that"
           " %zd bytes were read.\n", fd, sz);
    c[sz] = '\0';
    printf("Those bytes are as follows: %s\n", c);

    close(fd);
    free(c);
    return 0;
}

Output:
called read(3, c, 10). returned that 10 bytes were read.
Those bytes are as follows: 0 0 0 foo.
Suppose that foobar.txt consists of the 6 ASCII characters “foobar”. Then what is the
output of the following program?

// C program to illustrate
// read system Call
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdlib.h>

int main(void)
{
    char c;
    // foobar.txt contains the 6 ASCII characters "foobar"
    int fd1 = open("foobar.txt", O_RDONLY, 0);
    int fd2 = open("foobar.txt", O_RDONLY, 0);
    read(fd1, &c, 1);
    read(fd2, &c, 1);
    printf("c = %c\n", c);
    exit(0);
}

Output:
c = f



The descriptors fd1 and fd2 each have their own open file table entry, so each descriptor has its
own file position for foobar.txt. Thus, the read from fd2 reads the first byte of foobar.txt, and
the output is c = f, not c = o.
5. write: Writes cnt bytes from buf to the file or socket associated with fd. cnt should not be
greater than INT_MAX (defined in the limits.h header file). If cnt is zero, write() simply returns 0
without attempting any other action.
Syntax in C language
#include <unistd.h>
ssize_t write (int fd, const void* buf, size_t cnt);
Parameters:
∙ fd: file descriptor
∙ buf: buffer containing the data to write
∙ cnt: number of bytes to write
Returns: How many bytes were actually written
∙ the number of bytes written on success
∙ -1 on error
∙ -1 if interrupted by a signal before any data was written
Important points
∙ The file needs to be opened for write operations
∙ buf needs to be at least as long as specified by cnt; if the buffer is shorter than cnt,
write() reads past the end of the buffer.
∙ cnt is the requested number of bytes to write, while the return value is the actual number
of bytes written. The return value can be smaller than cnt, for example when the device
runs out of space or the call is interrupted by a signal.
∙ If write() is interrupted by a signal, the effect is one of the following:
-If write() has not written any data yet, it returns -1 and sets errno to EINTR.
-If write() has successfully written some data, it returns the number of bytes it wrote
before it was interrupted.



// C program to illustrate
// write system Call
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    const char *msg = "hello geeks\n";
    ssize_t sz;

    int fd = open("foo.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
    {
        perror("r1");
        exit(1);
    }

    sz = write(fd, msg, strlen(msg));

    printf("called write(%d, \"hello geeks\\n\", %zu)."
           " It returned %zd\n", fd, strlen(msg), sz);

    close(fd);
    return 0;
}

Output:
called write(3, "hello geeks\n", 12). It returned 12
Here, if you look in the file foo.txt after running the code, you find “hello geeks”. If foo.txt
already had some content, the O_TRUNC flag discards it when the file is opened, so all previous
content is deleted and only “hello geeks” remains in the file.



Print “hello world” from the program without using any printf or cout function.

// C program to illustrate
// I/O system Calls
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

int main(void)
{
    int fd[2];
    char buf1[12] = "hello world";
    char buf2[12];
    ssize_t n;

    // assume foobar.txt is already created
    fd[0] = open("foobar.txt", O_RDWR);
    fd[1] = open("foobar.txt", O_RDWR);
    if (fd[0] < 0 || fd[1] < 0)
        return 1;

    // write through fd[0], then read the same bytes back through fd[1]
    write(fd[0], buf1, strlen(buf1));
    n = read(fd[1], buf2, sizeof buf2);

    // fd 1 is stdout
    if (n > 0)
        write(1, buf2, (size_t)n);

    close(fd[0]);
    close(fd[1]);
    return 0;
}

Output:
hello world
In this code, the string “hello world” in the buf1 array is first written to foobar.txt through
fd[0]. It is then read back from the file into the buf2 array through fd[1], which has its own file
position starting at the beginning of the file. Finally, buf2 is written to stdout (fd 1), which
prints “hello world”.
