Unit 8 - File System Interface and Implementation

Unit 8 of the Operating Systems course covers the file system interface and its implementation, detailing concepts such as file attributes, operations, access methods, and directory structures. It emphasizes the importance of efficient data organization in secondary storage and introduces various file types and allocation methods. The unit aims to equip students with the knowledge to manage files and directories effectively within an operating system.

Uploaded by

magaby2
MASTER OF COMPUTER APPLICATIONS

O02CA503: Operating Systems

SEMESTER 1

O02CA503
OPERATING SYSTEMS
Unit: 8 – File System Interface and Implementation 1
O02CA503: Operating Systems

Unit 8
File System Interface and Implementation
TABLE OF CONTENTS

1 Introduction
   1.1 Objectives
2 Concept of a File (SAQ 1)
   2.1 Attributes of a file
   2.2 Operations on files
   2.3 Types of files
   2.4 Structure of file
3 File Access Methods
   3.1 Sequential access
   3.2 Direct access
   3.3 Indexed sequential access
4 Directory Structure (Figures 1, 2, 3; SAQ 2)
   4.1 Single-level directory
   4.2 Two-level directory
   4.3 Tree-structured directories
5 Allocation Methods (Figures 4, 5, 6, 7)
   5.1 Contiguous allocation
   5.2 Linked allocation
   5.3 Indexed allocation
   5.4 Performance comparison
6 Free Space Management
   6.1 Bit vector
   6.2 Linked list
   6.3 Grouping
   6.4 Counting
7 Directory Implementation (SAQ 3)
   7.1 Linear list
   7.2 Hash table
8 Overview of Mass Storage Structure
   8.1 Disk Structure
   8.2 Disk Scheduling Algorithms
9 Summary
10 Terminal Questions
11 Answers
   11.1 Self-Assessment Questions
   11.2 Terminal Questions


1. INTRODUCTION
In the previous unit, we discussed virtual memory and various page replacement algorithms.
The operating system is a resource manager. Secondary resources like the disk are also to be
managed. Information is stored in secondary storage because it costs less, is non-volatile and
provides large storage space. Processes access data / information present on secondary storage
while in execution. Thus, the operating system has to properly organize data / information in
secondary storage for efficient access.

The file system is the most visible part of an operating system. It is a way for on-line storage and
access of both data and code of the operating system and the users. It resides on the secondary
storage because of the two main characteristics of secondary storage, namely, large storage
capacity and non-volatile nature. This unit will give you an overview of file system interface and its
implementation.

1.1. Objectives
After studying this unit, you should be able to:
• explain various file concepts
• discuss different file access methods
• describe various directory structures
• list and explain various disk space allocation methods
• manage free space on the disk effectively
• implement a directory


2. CONCEPT OF A FILE

Users use different storage media such as magnetic disks, tapes, optical disks and so on. All these
different storage media have their own way of storing information. The operating system provides
a uniform logical view of information stored in these different media. The operating system
abstracts from the physical properties of its storage devices to define a logical storage unit called
a file. These files are then mapped on to physical devices by the operating system during use.
The storage devices are usually non-volatile, meaning the contents stored in these devices persist
through power failures and system reboots.

The concept of a file is extremely general. A file is a collection of related information recorded on
secondary storage, for example a file containing student information, a file containing employee
information or a file containing C source code. A file is thus the smallest allotment of logical
secondary storage; that is, any information to be stored on secondary storage needs to be
written to a file. Information in files could be program code or data
in numeric, alphanumeric, alphabetic or binary form, either formatted or in free form. A file is
therefore a collection of records if it is a data file, or a collection of bits / bytes / lines if it is code.
Program code stored in files could be source code, object code or executable code, whereas data
stored in files may consist of plain text, records pertaining to an application, images, sound and
so on. Depending on the contents of a file, each file has a pre-defined structure. For example, a
file containing text is a collection of characters organized as lines, paragraphs and pages, whereas
a file containing source code is an organized collection of segments which in turn are organized
into declarations and executable statements.

2.1. Attributes of a file


A file has a name. The file name is a string of characters. For example, test.c, pay.cob, master.dat,
os.doc. In addition to a name, a file has certain other attributes. Important attributes among them
are:

• Type: Information on the type of file.
• Location: The location of the file on the device.
• Size: The current size of the file in bytes.
• Protection: Control information for user access.
• Time, date and user id: Information regarding when the file was created, last modified and last
used. This information is useful for protection, security and usage monitoring.

All these attributes of files are stored in a centralized place called the directory. The directory
can become large when there are many files, and it must itself be kept on permanent storage.

2.2. Operations on files


A file is an abstract data type. Six basic operations are possible on files.

They are:

1) Creating a file: The two steps in file creation are allocating space for the file and making an
entry in the directory to record the name and location of the file.
2) Writing a file: The parameters required to write into a file are the name of the file and the
contents to be written into it. Given the name of the file, the operating system searches the
directory to find the location of the file. A write pointer, updated after every write, ensures
that the contents are written at the proper location in the file.
3) Reading a file: To read information stored in a file, the name of the file specified as a parameter
is searched by the operating system in the directory to locate the file. A read pointer, updated
after every read, helps read information from a particular location in the file.
4) Repositioning within a file: The file is searched for in the directory and a given new value replaces
the current file position. No I/O takes place. It is also known as a file seek.
5) Deleting a file: The directory is searched for the particular file. If it is found, file space and other
resources associated with that file are released and the corresponding directory entry is
erased.
6) Truncating a file: The file attributes remain the same, but the file has a reduced size
because the user deletes information in the file. The end of file pointer is reset.

Other common operations are combinations of these basic operations. They include append,
rename and copy. A file on the system is very similar to a manual file. An operation on a file is
possible only if the file is open. After performing the operation, the file is closed. All the above
basic operations, together with open and close, are provided by the operating system as system
calls.
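
The basic operations listed above can be tried out through the thin system-call wrappers in Python's os module. The following is a minimal sketch, not part of the original unit; the function name demo_basic_operations and the file name test.dat are hypothetical, and the file is created in a temporary directory.

```python
import os
import tempfile

def demo_basic_operations():
    # Work inside a temporary directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "test.dat")   # hypothetical file name

        # Create and open: space is set up and a directory entry is made.
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            os.write(fd, b"hello world")       # write; the file position advances
            os.lseek(fd, 6, os.SEEK_SET)       # reposition (seek); no data is transferred
            tail = os.read(fd, 5)              # read from the new position
            os.ftruncate(fd, 5)                # truncate; only the size changes
            size = os.fstat(fd).st_size
        finally:
            os.close(fd)                       # close the file when done

        os.unlink(path)                        # delete; the directory entry is erased
        still_exists = os.path.exists(path)
    return tail, size, still_exists
```

Each call corresponds to one operation in the list: open with O_CREAT creates, lseek repositions without any data I/O, ftruncate shortens the file while leaving its other attributes alone, and unlink removes the directory entry.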


2.3. Types of files


The operating system recognizes and supports different file types. The most common way of
implementing file types is to include the type of the file as part of the file name. The attribute ‘name’
of the file consists of two parts: a name and an extension separated by a period. The extension is
the part of a file name that identifies the type of the file. For example, in MS-DOS a file name can
be up to eight characters long followed by a period and then a three-character extension.
Executable files have a .com / .exe / .bat extension, C source code files have a .c extension,
COBOL source code files have a .cob extension and so on.

If an operating system can recognize the type of a file, then it can handle the file sensibly. For
example, an attempt to print an executable file should be aborted since it will produce only garbage.
Another use of file types is the ability to automatically recompile the latest version of the source
code so that the most recently modified program is executed. This is observed in the Turbo
/ Borland integrated program development environment.

2.4. Structure of file


File types are an indication of the internal structure of a file. Some files need to have a
structure that is understood by the operating system. For example, the structure of
executable files needs to be known to the operating system so that they can be loaded in memory and
control transferred to the first instruction for execution to begin. Some operating systems also
support multiple file structures.

Operating system support for multiple file structures makes the operating system more complex.
Hence some operating systems support only a minimal number of file structures. A very good
example of this type of operating system is the UNIX operating system. UNIX treats each file as a
sequence of bytes. It is up to the application program to interpret a file. Here maximum flexibility
is present, but support from the operating system is minimal. Irrespective of any file
structure support, every operating system must support at least an executable file structure to load
and execute programs.

Disk I/O is always in terms of blocks. A block is a physical unit of storage. Usually all blocks are
of the same size, for example 512 bytes. Logical records have their own size, which very rarely
matches the physical block size exactly. Therefore a number of logical records
are packed into one physical block. This helps the operating system to easily locate an offset
within a file. For example, as discussed above, UNIX treats files as a sequence of bytes, so the
logical record is effectively one byte. If each physical block is, say, 512 bytes, then the operating
system packs and unpacks 512 bytes of logical records into each physical block.

File access is always in terms of blocks. The logical record size, physical block size and packing
technique determine the number of logical records that can be packed into one physical block. The
mapping is usually done by the operating system. But since the total file size is not always an exact
multiple of the block size, the last physical block containing logical records is usually not full.
Some part of this last block is wasted. On average, half a block is wasted per file. This is termed
internal fragmentation. The larger the physical block size, the greater the internal fragmentation.
All file systems suffer from internal fragmentation. This is the penalty paid for easy file access by
the operating system in terms of blocks instead of bits or bytes.
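
The arithmetic behind packing and internal fragmentation can be sketched as follows. This is an illustrative calculation only; the function names are hypothetical, and records are assumed not to span block boundaries.

```python
def blocks_needed(file_size, block_size=512):
    """Number of physical blocks needed to hold file_size bytes."""
    return (file_size + block_size - 1) // block_size  # ceiling division

def internal_fragmentation(file_size, block_size=512):
    """Bytes left unused in the last block of the file."""
    return blocks_needed(file_size, block_size) * block_size - file_size

def records_per_block(record_size, block_size=512):
    """Whole records that fit in one block (records do not span blocks)."""
    return block_size // record_size
```

For a 1300-byte file, 512-byte blocks leave 236 bytes unused in the last block, while 4096-byte blocks leave 2796 bytes unused, which illustrates why larger blocks increase internal fragmentation.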

SELF-ASSESSMENT QUESTIONS – 1

1. The operating system provides a uniform logical view of information stored in different
storage media. (True / False)
2. A _______________ is a collection of related information recorded on the secondary
storage.
3. Usually a block size will be _______________ bytes. (Pick the right option)
a) 512
b) 256
c) 128
d) 64


3. FILE ACCESS METHODS

Information is stored in files. Files reside on secondary storage. When this information is to be
used, it has to be accessed and brought into main memory. Information in files could be
accessed in many ways, and the choice usually depends on the application. Access methods could be:

• Sequential access
• Direct access
• Indexed sequential access

3.1. Sequential access


In this simple access method, information in a file is accessed sequentially, one record after
another. To process the ith record, all the i-1 records before it must be accessed first. Sequential
access is based on the tape model, since a tape is inherently a sequential access device. Sequential
access is best suited where most of the records in a file are to be processed, for example
transaction files.

3.2. Direct access


Sometimes it is not necessary to process every record in a file. It may not be necessary to process
records in the order in which they are present. Sometimes a record is to be accessed only
because some key value in that record is known. In all such cases, direct access is used.
Direct access is based on the disk model, since a disk is a direct access device and allows random
access to any file block. Since a file is a collection of physical blocks, any block and hence the
records in that block can be accessed. Master files are an example. Databases are often of this type
since they allow query processing that involves immediate access to large amounts of information. All
reservation systems fall into this category. Not all operating systems support direct access files.
Usually files are to be defined as sequential or direct at the time of creation and accessed
accordingly later. Sequential access of a direct access file is possible but direct access of a
sequential file is not.
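
With fixed-length records, direct access amounts to computing an offset and seeking to it. The sketch below assumes a record length of 16 bytes and uses an in-memory byte stream as a stand-in for a disk file; RECORD_SIZE, make_file and read_record are hypothetical names.

```python
import io

RECORD_SIZE = 16  # assumed fixed record length in bytes

def make_file(count):
    # In-memory stand-in for a disk file holding `count` fixed-length records.
    data = b"".join(
        f"record-{i:03d}".encode().ljust(RECORD_SIZE, b".") for i in range(count)
    )
    return io.BytesIO(data)

def read_record(f, n):
    # Direct access: jump straight to record n without reading records 0..n-1.
    f.seek(n * RECORD_SIZE)
    return f.read(RECORD_SIZE)
```

Reading the same stream from offset 0 onwards, one record at a time, gives sequential access, which is why a direct access file can also be read sequentially.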


3.3. Indexed sequential access


This access method is a slight modification of the direct access method. It is in fact a combination
of both sequential access and direct access. The main concept is to access a file directly
first and then sequentially from that point onwards. This access method involves maintaining an
index. Each index entry is a pointer to a block. To access a record in a file, a direct access of the
index is made. The information obtained from this access is used to access the file. For example, the
direct access to the index gives the block address, and within the block the record is accessed
sequentially. Sometimes indexes may be big. So a hierarchy of indexes is built, in which one direct
access of an index leads to the information needed to access another index directly, and so on till
the actual file is accessed sequentially for the particular record. The main advantage of this type of
access is that both direct and sequential access of files are possible.
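
The two-step lookup can be sketched with a small in-memory example. The data, the one-level index and the function name lookup are assumptions made for illustration: the index holds the first key of each block, a binary search picks the block directly, and the record is then found by a sequential scan inside that block.

```python
import bisect

# Each block holds records sorted by key (hypothetical sample data).
blocks = [
    [(10, "A"), (20, "B"), (30, "C")],
    [(40, "D"), (50, "E"), (60, "F")],
    [(70, "G"), (80, "H"), (90, "I")],
]
# The index stores the first key of every block: [10, 40, 70].
index = [block[0][0] for block in blocks]

def lookup(key):
    # Direct step: find the last block whose first key is <= key.
    pos = bisect.bisect_right(index, key) - 1
    if pos < 0:
        return None
    # Sequential step: scan the records inside that one block.
    for k, value in blocks[pos]:
        if k == key:
            return value
    return None
```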


4. DIRECTORY STRUCTURE

File systems are very large, so files have to be organized. Usually a two-level organization is used:

• The file system is divided into partitions. By default, there is at least one partition. Partitions
are nothing but virtual disks with each partition considered as a separate storage device.
• Each partition has information about the files in it. This information is nothing but a table of
contents. It is known as a directory.

The directory maintains information about the name, location, size and type of all files in the
partition. A directory has a logical structure. This is dependent on many factors including
operations that are to be performed on the directory like search for file/s, create a file, delete a file,
list a directory, rename a file and traverse a file system. For example, the dir, del, ren commands
in MS-DOS.

4.1. Single-level directory


This is a simple directory structure that is very easy to support. All files reside in one and the same
directory (See figure 8.1).

Figure 8.1: Single-level directory structure

A single-level directory has limitations as the number of files and users increases. Since there is
only one directory to list all the files, no two files can have the same name, i.e., file names must be
unique in order to identify one file from another. Even with one user, it is difficult to maintain files
with unique names when the number of files becomes large.


4.2. Two-level directory


The main limitation of a single-level directory is that file names must be unique even across
different users. One solution to the problem is to create a separate directory for each user.

A two-level directory structure has one directory exclusively for each user. The user directories
are all similar in structure, and each maintains information only about the files present in that
directory. The operating system has one master directory for a partition. This directory has entries
for each of the user directories (Refer figure 8.2).

Figure 8.2: Two-level directory structure

Files with the same name can exist in different user directories but not in the same user directory.
File maintenance is easy. Users are isolated from one another. But when users work in a group and
each wants to access files in another user’s directory, it may not be possible.

Access to a file is through user name and file name. This is known as a path. Thus a path uniquely
defines a file. For example, in MS-DOS if ‘C’ is the partition then C:\USER1\TEST,
C:\USER2\TEST, C:\USER3\C are all files in user directories. Files could be created, deleted,
searched and renamed in the user directories only.
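
A two-level directory can be modelled as a master directory that maps each user name to that user's own directory. The sketch below is illustrative only; the block numbers and the names master_directory, locate and create are hypothetical.

```python
# Master directory: one entry per user; each user directory maps
# file name -> starting block (block numbers are made up).
master_directory = {
    "USER1": {"TEST": 120},
    "USER2": {"TEST": 348},
}

def locate(user, name):
    # A path is the pair (user name, file name).
    return master_directory[user][name]

def create(user, name, start_block):
    user_directory = master_directory[user]
    # Names must be unique only within one user's directory.
    if name in user_directory:
        raise FileExistsError(f"{user}\\{name} already exists")
    user_directory[name] = start_block
```

Both users own a file called TEST without conflict, because the search for a name is confined to one user directory.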

4.3. Tree-structured directories


A two-level directory is a tree of height two, with the master file directory at the root having
user directories as descendants that in turn have the files themselves as descendants. A
tree-structured directory generalizes this to a tree of arbitrary height (Refer figure 8.3). This
generalization allows users to organize files within user directories into sub directories. Every
file has a unique path. Here the path is from the root through all the sub directories to the
specific file.

Figure 8.3: Tree-structured directory structure

Usually, the user has a current directory. User created sub directories could be traversed. Files
are usually accessed by giving their path names. Path names could be either absolute or relative.
Absolute path names begin with the root and give the complete path down to the file. Relative
path names begin with the current directory. Allowing users to define sub directories allows for
organizing user files based on topics. A directory is treated as yet another file in the directory
higher up in the hierarchy. Deleting a directory that is not empty can be handled in two ways: require
that all files in it be deleted first, or delete all entries in the directory when the directory is
deleted. Deletion may be a recursive process since the directory to be deleted may contain sub
directories.
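
Absolute and relative path names can be resolved by walking the tree one component at a time. The following sketch uses nested dictionaries as directories; the tree contents, the "/" separator and the function name resolve are assumptions made for illustration.

```python
# Directories are dictionaries; files are represented by the string "file".
root = {
    "users": {
        "alice": {
            "notes.txt": "file",
            "src": {"main.c": "file"},
        }
    }
}

def resolve(path, cwd=()):
    # Absolute paths start at the root; relative paths start at cwd,
    # given here as a tuple of directory names from the root.
    parts = [p for p in path.split("/") if p]
    if not path.startswith("/"):
        parts = list(cwd) + parts
    node = root
    for part in parts:
        node = node[part]  # raises KeyError if a component is missing
    return node
```

With the current directory set to users/alice, the relative path src/main.c and the absolute path /users/alice/src/main.c reach the same file.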


SELF-ASSESSMENT QUESTIONS – 2

4. The CPU can directly process the information stored in secondary memory without
transferring it to main memory. (True / False)
5. In _______________ method, information in a file is accessed sequentially one
record after another.
6. A _______________ is a tree of height two with the master file directory at the root
having user directories as descendants that in turn have the files themselves as
descendants. (Pick the right option)
a) Single level directory
b) Tree structured directory
c) Two level directory
d) Directory


5. ALLOCATION METHODS

Allocation of disk space to files is a problem that looks at how effectively disk space is utilized and
how quickly files can be accessed. The three major methods of disk space allocation are:

• Contiguous allocation
• Linked allocation
• Indexed allocation

5.1. Contiguous allocation


Contiguous allocation requires a file to occupy contiguous blocks on the disk. Because of this
constraint disk access time is reduced, as disk head movement is usually restricted to only one
track. The number of seeks needed to access a contiguously allocated file is minimal, and so is the
seek time.

A file that is ‘n’ blocks long starting at a location ‘b’ on the disk occupies blocks b, b+1, b+2, …..,
b+(n-1). The directory entry for each contiguously allocated file gives the address of the starting
block and the length of the file in blocks as illustrated below (See figure 8.4).

Figure 8.4: Contiguous allocation

Accessing a contiguously allocated file is easy. Both sequential and random access of a file are
possible. If a sequential access of a file is made then the next block after the current one is
accessed, whereas if a direct access is made then the address of the ith block is calculated as b+i,
where b is the starting block address and blocks are counted from 0.
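
The address calculation is simple enough to state as code. In this sketch the directory entry is assumed to be the pair (start, length) described above, the block index i is counted from 0, and the function name block_address is hypothetical.

```python
def block_address(start, length, i):
    """Disk address of block i (0-based) of a contiguously allocated file."""
    if not 0 <= i < length:
        raise IndexError("block index outside the file")
    return start + i
```

A file of 3 blocks starting at block 14 therefore occupies blocks 14, 15 and 16, and any of them is reached with a single calculation.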


A major disadvantage of contiguous allocation is the difficulty of finding enough contiguous space
for the file. From the set of free holes, a first-fit or best-fit strategy is adopted to find a hole of
at least ‘n’ contiguous blocks for a file of size ‘n’. But these algorithms suffer from external
fragmentation. As disk space is allocated and released, a single large hole of disk space is
fragmented into smaller holes. Sometimes the total size of all the holes put together is larger than
the size of the file that is to be allocated space. But the file cannot be allocated in that space
because there is no single contiguous hole at least as large as the file. This is when external
fragmentation has occurred. Compaction of disk space is a solution to external fragmentation. But it
has a very large overhead.
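
First-fit and best-fit can be sketched over a list of free holes, each given as a (start, length) pair. The hole list and function names below are hypothetical and only illustrate the two strategies and the external fragmentation they run into.

```python
def first_fit(holes, n):
    # Take the first hole that is large enough.
    for start, length in holes:
        if length >= n:
            return start
    return None

def best_fit(holes, n):
    # Take the smallest hole that is still large enough.
    candidates = [(length, start) for start, length in holes if length >= n]
    return min(candidates)[1] if candidates else None

# Free holes as (start block, length in blocks): 13 free blocks in total.
holes = [(0, 3), (10, 6), (20, 4)]
```

For a 4-block file, first-fit picks the hole at block 10 while best-fit picks the exactly fitting hole at block 20. A request for 8 blocks fails under both strategies even though 13 blocks are free in total, which is external fragmentation.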

Another problem with contiguous allocation is determining the space needed for a file. The file is
a dynamic entity that grows and shrinks. If the allocated space is just enough (a best-fit allocation
strategy is adopted) and the file grows, there may not be space on either side of the file to
expand. The solution to this problem is to reallocate the file into a bigger space and release
the existing space. Another solution, possible if the final file size is known in advance, is
to make an allocation for the known file size. But in this case there is always a possibility of a large
amount of internal fragmentation, because initially the file may not occupy the entire space and
may also grow very slowly.

5.2. Linked allocation


Linked allocation overcomes all problems of contiguous allocation. A file is allocated blocks of
physical storage in any order. A file is thus a list of blocks that are linked together. The directory
contains the address of the starting block and the ending block of the file. The first block contains
a pointer to the second, the second a pointer to the third, and so on till the last block (Refer figure
8.5).


Figure 8.5: Linked allocation

Initially a block is allocated to a file, with the directory having this block as the start and end. As
the file grows, additional blocks are allocated with the current block containing a pointer to the
next and the end block being updated in the directory.

This allocation method does not suffer from external fragmentation because any free block can
satisfy a request. Hence there is no need for compaction. Moreover, a file can grow and shrink
without problems of allocation.

Linked allocation has some disadvantages. Efficient random access of files is not possible. To access
the ith block, access begins at the beginning of the file and follows the pointers in all the blocks
till the ith block is reached. Therefore, access is always sequential. Also, some space in every
allocated block is used for storing a pointer. This is clearly an overhead, as a fixed percentage of
every block is lost. The overhead is reduced by allocating blocks in clusters, which are nothing but
groups of blocks. But this tends to increase internal fragmentation. Another problem in this
allocation scheme is reliability, because the pointers are scattered over the disk. If for any reason
a pointer is lost or damaged, then the file after that block is inaccessible. A doubly linked block
structure may solve the problem at the cost of additional pointers to be maintained.
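
The cost of reaching the ith block under linked allocation can be made concrete with a small simulation. The disk contents and the function name read_block are hypothetical; each block is modelled as a pair (data, pointer to next block), and the function counts how many blocks must be read.

```python
# Simulated disk: block number -> (data, next block); None marks the last block.
disk = {
    9: ("A", 16),
    16: ("B", 1),
    1: ("C", 10),
    10: ("D", 25),
    25: ("E", None),
}

def read_block(start, i):
    """Return (data, disk reads) for block i (0-based) of the file starting at `start`."""
    block, reads = start, 0
    for _ in range(i):
        reads += 1                  # a block must be read to learn its successor
        block = disk[block][1]
        if block is None:
            raise IndexError("file has fewer blocks than requested")
    reads += 1                      # finally read the wanted block itself
    return disk[block][0], reads
```

Reaching block 3 of this file takes four disk reads, compared with one read under contiguous allocation.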

MS-DOS uses a variation of linked allocation called the File Allocation Table (FAT). The FAT
resides on the disk, contains one entry for each disk block and is indexed by block number. The
directory contains the starting block address of the file. The FAT entry for this block holds the
number of the next block, and so on till the last block (Refer figure 8.6). Random access of files is
improved because the chain can be followed in the FAT itself, without reading the data blocks, to find
a direct block address.

Figure 8.6: File allocation table
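
Following a chain through the table can be sketched as below. The table size, the block numbers and the end-of-file marker END are assumptions for illustration and do not reproduce the on-disk FAT format.

```python
END = -1  # assumed end-of-file marker for this sketch

# One table entry per disk block, indexed by block number.
fat = [None] * 16
fat[4] = 7      # block 4 is followed by block 7
fat[7] = 2      # block 7 is followed by block 2
fat[2] = END    # block 2 is the last block of the file

def file_blocks(fat, start):
    """List the blocks of a file in order by following the table only."""
    chain = []
    block = start
    while block != END:
        chain.append(block)
        block = fat[block]
    return chain
```

Because the whole chain lives in the table, the address of the ith block is found without touching any data block, which is cheap when the table is cached in memory.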

5.3. Indexed allocation


Problems of external fragmentation and size declaration present in contiguous allocation are
overcome in linked allocation. But in the absence of a FAT, linked allocation does not support
random access of files, since pointers hidden in blocks need to be accessed sequentially. Indexed
allocation solves this problem by bringing all pointers together into an index block. This also solves
the problem of scattered pointers in linked allocation. Each file has an index block. The directory
entry holds the address of this index block, and the index block contains only block addresses in the
order in which they are allocated to the file. The ith address in the index block is the ith block of
the file (Refer figure 8.7). Here both sequential and direct access of a file are possible. Also it
does not suffer from external fragmentation.


Figure 8.7: Indexed Allocation

Indexed allocation does suffer from wasted block space. Pointer overhead is more in indexed
allocation than in linked allocation. Every file needs an index block. Then what should be the size
of the index block? If it is too big, space is wasted. If it is too small, large files cannot be stored.
Several index blocks can be linked together so that large files can be stored. Multilevel index blocks
are also used. A combined scheme, with direct block pointers together with single, double and triple
indirect index blocks, has been implemented in the UNIX operating system.
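
The trade-off in choosing the index block size can be quantified. The sketch below assumes the index block is one disk block and that every pointer has a fixed size; the function name and the sample sizes (512-byte blocks, 4-byte pointers) are illustrative.

```python
def max_file_size(block_size, pointer_size, levels=1):
    """Largest file (in bytes) addressable through `levels` levels of index blocks."""
    pointers_per_block = block_size // pointer_size
    return pointers_per_block ** levels * block_size
```

With 512-byte blocks and 4-byte pointers, one index block holds 128 pointers and so limits a file to 65536 bytes, while a two-level index raises the limit to 8388608 bytes at the cost of an extra index lookup.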

5.4. Performance comparison


All the three allocation methods differ in storage efficiency and block access time. Contiguous
allocation requires only one disk access to get a block, whether it is the next block (sequential) or
the ith block (direct). In the case of linked allocation, the address of the next block is available in
the current block being accessed and so is very much suited for sequential access. Hence direct
access files could use contiguous allocation and sequential access files could use linked allocation.
But if this is fixed then the type of access on a file needs to be declared at the time of file creation.
Thus, a sequential access file will be linked and cannot support direct access. On the other hand,
a direct access file will have contiguous allocation and can also support sequential access; the
constraint in this case is making known the file length at the time of file creation. The operating
system will then have to support algorithms and data structures for both allocation methods.

Conversion of one file type to another needs a copy operation to the desired file type.

Some systems support both contiguous and indexed allocation. Initially all files have contiguous
allocation. As they grow, a switch to indexed allocation takes place. If, on average, files are small,
then contiguous file allocation is advantageous and provides good performance.


6. FREE SPACE MANAGEMENT

The disk is a scarce resource. Also, disk space can be reused. Free space present on the disk is
maintained by the operating system. Physical blocks that are free are listed in a free-space list.
When a file is created or a file grows, requests for blocks of disk space are checked in the free-
space list and then allocated. The list is updated accordingly. Similarly, freed blocks are added to
the free-space list. The free-space list could be implemented in many ways as follows:

6.1. Bit vector


A bit map or a bit vector is a very common way of implementing a free-space list. This vector has ‘n’
bits, where ‘n’ is the total number of available disk blocks. A free block has its
corresponding bit set (1) in the bit vector whereas an allocated block has its bit reset (0).

Illustration: If blocks 2, 4, 5, 9, 10, 12, 15, 18, 20, 22, 23, 24, 25, 29 are free and the rest are
allocated, then a free-space list implemented as a bit vector would look as shown below:

00101100011010010010101111000100000………

The advantage of this approach is that it is very simple to implement and efficient to access. If
only one free block is needed, then a search for the first ‘1’ in the vector is necessary. If a
contiguous allocation for ‘b’ blocks is required, then a contiguous run of ‘b’ number of 1’s is
searched. And if the first-fit scheme is used then the first such run is chosen, and the best of such
runs is chosen if best-fit scheme is used.

Bit vectors are inefficient if they are not in memory. Also, the size of the vector has to be updated
if the size of the disk changes.
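
The illustration above can be reproduced with a short sketch. The bit vector is held as a string of '0' and '1' characters purely for readability, blocks are numbered from 0, and the function names are hypothetical.

```python
FREE_BLOCKS = [2, 4, 5, 9, 10, 12, 15, 18, 20, 22, 23, 24, 25, 29]

def build_bit_vector(free_blocks, total_blocks):
    # Bit i is '1' if block i is free and '0' if it is allocated.
    free = set(free_blocks)
    return "".join("1" if i in free else "0" for i in range(total_blocks))

def first_free(bits):
    # Position of the first free block, or None if the disk is full.
    pos = bits.find("1")
    return pos if pos >= 0 else None

def first_fit_run(bits, b):
    # Start of the first run of b contiguous free blocks, or None.
    pos = bits.find("1" * b)
    return pos if pos >= 0 else None
```

For the first 35 blocks this produces the vector shown in the illustration. The first free block is block 2, and the first run of four contiguous free blocks starts at block 22.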

6.2. Linked list


All free blocks are linked together. The free-space list head contains the address of the first free
block. This block in turn contains the address of the next free block and so on. This scheme works
well for linked allocation. If contiguous allocation is used, then a search for ‘b’ contiguous free
blocks calls for traversal of the free-space list, which is not efficient. The FAT in MS-DOS builds
free block accounting into the allocation data structure itself, where free blocks are marked by a
reserved entry value in the FAT (0 in FAT file systems).


6.3. Grouping
Another approach is to store ‘n’ free block addresses in the first free block. The first (n-1) of
these blocks are actually free. The last, nth address is the address of a block that contains the
next set of free block addresses. This method has the advantage that a large number of free block
addresses are available at a single place, unlike in the previous linked approach where free block
addresses are scattered.

6.4. Counting
If contiguous allocation is used and a file has freed its disk space then a contiguous set of ‘n’
blocks is free. Instead of storing the addresses of all these ‘n’ blocks in the free-space list, only
the starting free block address and a count of the number of blocks free from that address can be
stored. This is exactly what is done in this scheme where each entry in the free-space list is a disk
address followed by a count.
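
As a rough sketch of the counting scheme, the Python function below collapses a list of free block numbers into (starting address, count) entries. The function name to_counting_entries is hypothetical, and the free-block list is the one used in the bit vector illustration of this unit.

```python
def to_counting_entries(free_blocks):
    """Collapse free block numbers into (start, count) runs of contiguous blocks."""
    entries = []
    for block in sorted(free_blocks):
        if entries and block == entries[-1][0] + entries[-1][1]:
            # The block directly follows the last run, so extend that run by one
            start, count = entries[-1]
            entries[-1] = (start, count + 1)
        else:
            # The block starts a new run of length 1
            entries.append((block, 1))
    return entries


free = [2, 4, 5, 9, 10, 12, 15, 18, 20, 22, 23, 24, 25, 29]
print(to_counting_entries(free))
# [(2, 1), (4, 2), (9, 2), (12, 1), (15, 1), (18, 1), (20, 1), (22, 4), (29, 1)]
```

Fourteen block addresses shrink to nine entries here; the saving grows as the free runs get longer.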

7. DIRECTORY IMPLEMENTATION

The two main methods of implementing a directory are:

• Linear list
• Hash table

7.1. Linear list


A linear list of file names with pointers to the data blocks is one way to implement a directory.
Finding a particular file requires a linear search, so the method is simple but searching is time
consuming. To create a file, the list is searched to make sure that no file with the same name
exists; if none is found, the new file is added at the end of the directory. To delete a file, the
list is searched for the file name and, if it is found, the allocated space is released. Because
directory information is used frequently, repeated linear searches increase access time, which is
not desirable. A sorted list allows a binary search, which is faster than a linear search, but
keeping the list sorted adds overhead to every file creation and deletion.
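
The trade-off of a sorted list can be sketched with Python's standard bisect module. The class SortedDirectory and its fields are hypothetical names for this illustration: lookups use a binary search, while creating a file has to shift entries to keep the list in order.

```python
import bisect


class SortedDirectory:
    """Directory kept as a sorted list of names, searched by binary search."""

    def __init__(self):
        self.names = []   # file names, always kept in sorted order
        self.blocks = []  # blocks[i] is the first data block of names[i]

    def _find(self, name):
        # Binary search: position where name is, or where it would be inserted
        i = bisect.bisect_left(self.names, name)
        found = i < len(self.names) and self.names[i] == name
        return i, found

    def create(self, name, first_block):
        i, found = self._find(name)
        if found:
            raise FileExistsError(name)
        self.names.insert(i, name)           # shifting entries is the overhead
        self.blocks.insert(i, first_block)

    def lookup(self, name):
        i, found = self._find(name)
        return self.blocks[i] if found else None

    def delete(self, name):
        i, found = self._find(name)
        if not found:
            raise FileNotFoundError(name)
        del self.names[i]
        return self.blocks.pop(i)            # block whose space can be released


d = SortedDirectory()
d.create("notes.txt", 40)
d.create("a.out", 12)
print(d.names)                # ['a.out', 'notes.txt']
print(d.lookup("notes.txt"))  # 40
```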

7.2. Hash table


Another data structure for directory implementation is the hash table. A linear list is still used
to store the directory entries, but a hash table takes a value computed from the file name and
returns a pointer to the file name in the linear list, so search time is greatly reduced. Collisions,
where two file names hash to the same location, must be resolved on insertion. The main problem is
that the hash table generally has a fixed size and the hash function depends on that size. A
solution to the problem is to allow for chained overflow, with each hash entry being a linked list.
Directory lookups in a hash table are faster than in a linear list.
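
A minimal Python sketch of a hashed directory with chained overflow is given below. The class name HashedDirectory, the table size and the toy hash function (sum of character codes modulo the table size) are all assumptions chosen for illustration, not a description of any particular file system.

```python
class HashedDirectory:
    """Linear list of entries plus a hash table of chains that index into it."""

    def __init__(self, table_size=8):
        self.entries = []                                # (file_name, first_block)
        self.buckets = [[] for _ in range(table_size)]   # chains of entry indices

    def _bucket(self, name):
        # Toy hash function: depends on the (fixed) table size, as noted above
        return self.buckets[sum(map(ord, name)) % len(self.buckets)]

    def lookup(self, name):
        # Only the entries chained in one bucket are examined
        for idx in self._bucket(name):
            entry_name, first_block = self.entries[idx]
            if entry_name == name:
                return first_block
        return None

    def create(self, name, first_block):
        if self.lookup(name) is not None:
            raise FileExistsError(name)
        self.entries.append((name, first_block))
        self._bucket(name).append(len(self.entries) - 1)  # colliding names share a chain


d = HashedDirectory(table_size=4)
d.create("ab", 10)
d.create("ba", 20)      # same hash value as "ab": resolved by chaining
print(d.lookup("ab"))   # 10
print(d.lookup("ba"))   # 20
```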

SELF-ASSESSMENT QUESTIONS – 3

7. In contiguous allocation the disk access time is reduced, as disk head movement is
usually restricted to only one track. (True / False)
8. FAT stands for _______________.
9. If the file size is small, then _______________ is advantageous and provides good
performance. (Pick the right option)
a) Linked allocation
b) Indexed allocation
c) Contiguous file allocation
d) Linked with indexed allocation

8. OVERVIEW OF MASS STORAGE STRUCTURE


Mass storage structures refer to the organisation and management of large-scale data storage
systems, which are essential for retaining vast amounts of data over extended periods. These
systems typically encompass various types of storage devices, such as hard disk drives (HDDs),
solid-state drives (SSDs), magnetic tapes, and optical discs. In a computer system, mass storage
is managed through a hierarchy that prioritises speed, cost, and capacity, with faster, more
expensive storage (like cache and RAM) at the top and slower, larger, and more cost-effective
storage (like HDDs and SSDs) at the bottom. The file system, a crucial operating system
component, plays a key role in managing these storage resources, organising data, allocating disk
space, and maintaining file integrity. Advanced techniques like RAID (Redundant Array of
Independent Disks) are employed to improve performance and ensure data redundancy. Mass
storage systems are designed to provide a reliable and efficient way to store, retrieve, and manage
data, catering to the needs of various applications, from personal computing to enterprise-level
data centres.

8.1 Disk Structure

Figure: Disk Structure

The figure shows the physical structure of a hard disk drive (HDD). The main components
involved in data storage and retrieval are:

1. Platters: These are the circular disks inside the HDD made of a rigid material, usually aluminum
or glass, coated with a magnetic material. Data is stored magnetically on these platters.

2. Read/Write Heads: These are the components that the drive uses to read data from and write
data to the platters. They "float" just above the surface of the platters on a cushion of air
generated by the spinning platters.

3. Track: This is a concentric circle on the surface of a platter. Tracks are sub-divided into sectors
and are the paths on which the read/write head reads and writes data.

4. Sector: These are subdivisions of a track and represent the smallest unit of storage on a disk.
A sector typically stores a fixed amount of data, such as 512 bytes or 4 KB (4,096 bytes).

5. Cylinder: This is a concept used to describe the set of tracks that are at the same position on
each platter within the HDD. When the read/write heads move across the platters to access
data, they move in unison, so all the heads are over the same track number on each platter,
forming a cylinder.

6. Spindle: This is the axis on which the platters spin. It is driven by the motor, allowing the platters
to rotate at high speeds, which is essential for the read/write heads to access data.

7. Arm Assembly: This assembly holds the read/write heads and is mechanically moved in and
out to place the heads over the desired track on the platters. It is precisely controlled to allow
the heads to find and follow the tracks on the platters.

8.2 Disk Scheduling


Disk scheduling, also known as I/O scheduling, is the method by which the operating system
decides the order in which block I/O operations will be processed. The purpose of disk scheduling
is to optimize the way the system handles read and write requests to the disk.
The common disk scheduling algorithms are:
1. FCFS (First-Come, First-Served)
2. SSTF (Shortest Seek Time First)
3. SCAN (Elevator Algorithm)
4. C-SCAN (Circular SCAN)
5. LOOK
6. C-LOOK

1. FCFS (First-Come, First-Served)


The First-Come, First-Served (FCFS) disk scheduling algorithm is one of the simplest forms of
disk I/O scheduling. This algorithm services the disk I/O requests in the exact order they arrive in
the disk queue.
The main advantage of this method is its simplicity and fairness, as no request is given priority
over another. However, it can be inefficient as it doesn't consider the current position of the disk
head or the physical layout of the disk.
Example: Imagine a disk with 200 tracks numbered 0 to 199. The disk head is currently at track 53.
We have a queue of disk access requests for tracks in the following order: 98, 183, 37, 122, 14,
124, 65, 67. The FCFS algorithm services these requests in the exact order in which they are listed.
Here is the order of servicing and the movement of the disk head:

• Current position is 53. The first request is for track 98, so the disk head moves from 53 to
98.
• Next, move to track 183. The disk head moves from 98 to 183.
• Then, move to track 37. The disk head moves from 183 to 37.
• Next, move to track 122. The disk head moves from 37 to 122.
• Move to track 14. The disk head moves from 122 to 14.
• Then, to track 124. The disk head moves from 14 to 124.
• Next, to track 65. The disk head moves from 124 to 65.
• Finally, to track 67. The disk head moves from 65 to 67.

The total head movement in this example can be calculated by summing the distances moved
between each of these requests. This total head movement is used as a metric to determine the
efficiency of the disk scheduling algorithm. The FCFS algorithm does not minimize this head
movement, and thus, it may not be the most efficient algorithm, especially in a system with a large
number of disk I/O requests.

Starting from track 53, the head movements will be calculated as follows:
1. Move from 53 to 98: ∣98−53∣=45 tracks
2. Move from 98 to 183: ∣183−98∣=85 tracks
3. Move from 183 to 37: ∣183−37∣=146 tracks
4. Move from 37 to 122: ∣122−37∣=85 tracks
5. Move from 122 to 14: ∣122−14∣=108 tracks
6. Move from 14 to 124: ∣124−14∣=110 tracks
7. Move from 124 to 65: ∣124−65∣=59 tracks
8. Move from 65 to 67: ∣67−65∣=2 tracks

Now, we add up all these movements to get the total head movement:
Total Head Movement = 45 + 85 + 146 + 85 + 108 + 110 + 59 + 2 = 640 tracks
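
The same total can be reproduced with a few lines of Python. The function name fcfs_head_movement is only illustrative; it sums the distance between consecutive positions of the head in arrival order.

```python
def fcfs_head_movement(head, requests):
    """Total tracks travelled when requests are served strictly in arrival order."""
    total = 0
    for track in requests:
        total += abs(track - head)  # seek distance for this request
        head = track                # the head now rests on the serviced track
    return total


queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(fcfs_head_movement(53, queue))  # 640
```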

2. SSTF (Shortest Seek Time First)


SSTF (Shortest Seek Time First) is a disk scheduling algorithm that selects the disk I/O
request that requires the least movement of the disk arm from its current position, regardless of
the direction. This greedy choice reduces the total seek time for a series of disk I/O requests
compared with FCFS, although it does not guarantee the minimum possible total and can starve
requests that lie far from the head.

Example: Imagine a disk with 200 tracks numbered 0 to 199. The disk head is currently at track
53. We have a queue of disk access requests for tracks in the following order: 98, 183, 37, 122,
14, 124, 65, 67. Under SSTF, starting at track 53, the disk head visits the tracks in the following
order:
• First, it moves to track 65, as it is closest to the starting track 53.
• Next, it goes to track 67 and then to track 37, each being the nearest pending request at that
step.
• After that, the head moves to track 14 and then to track 98.
• Continuing, it accesses tracks 122 and 124 in sequence.
• Finally, it serves the request at track 183.

This order ensures the shortest seek time first, as at each step, the disk head moves to the nearest
track with a pending request.

To calculate the total head movement in the Shortest Seek Time First (SSTF) scheduling, you
follow the head as it moves to the nearest track request at each step. Here's how it works step by
step:

1. The head starts at track 53.

2. The nearest request is at track 65, so it moves to 65. This is a movement of ∣65−53∣=12
tracks.

3. The next nearest request is at track 67, moving from 65. This movement is ∣67−65∣=2 tracks.

4. Then it moves to track 37, which is ∣67−37∣=30 tracks away.

5. Next, it moves to track 14, moving ∣37−14∣=23 tracks.

6. It then goes to track 98, moving ∣98−14∣=84 tracks.

7. From 98, it moves to 122, which is ∣122−98∣=24 tracks.

8. Next, it moves to track 124, just ∣124−122∣=2 tracks.

9. Finally, it moves to the furthest request at track 183, which is ∣183−124∣=59 tracks.

Add up all the individual movements to get the total head movement: 12+2+30+23+84+24+2+59
= 236 tracks.
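
The step-by-step selection above can be sketched in Python as follows. The function name sstf is illustrative, and ties between two equally distant requests are broken in favour of the one that arrived first, which is an assumption made for this sketch.

```python
def sstf(head, requests):
    """Serve the nearest pending request each time; return (service order, total movement)."""
    pending = list(requests)
    order, total = [], 0
    while pending:
        # min() returns the first of equally near requests, i.e. the earlier arrival
        nearest = min(pending, key=lambda track: abs(track - head))
        pending.remove(nearest)
        total += abs(nearest - head)
        head = nearest
        order.append(nearest)
    return order, total


order, total = sstf(53, [98, 183, 37, 122, 14, 124, 65, 67])
print(order)  # [65, 67, 37, 14, 98, 122, 124, 183]
print(total)  # 236
```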

3. SCAN
SCAN disk scheduling, also known as the elevator algorithm, is a method the operating system
uses to schedule disk I/O requests.
It works as follows:
✓ The disk arm starts at a particular track and moves in a direction towards one end of the disk.
✓ As it moves, it services all the requests (reads or writes) that come along that path.
✓ Upon reaching the end of the disk, the disk arm reverses its direction.
✓ The arm then services the remaining requests in the new direction, again servicing all requests
encountered along this path.

The name "elevator algorithm" comes from the similarity of the disk arm movement to an elevator
in a building, which travels in one direction servicing all the floors (requests) on its path until it
reaches the top or bottom and then reverses direction.

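Example: Consider the same disk with 200 tracks numbered 0 to 199, the head at track 53 and the
request queue 98, 183, 37, 122, 14, 124, 65, 67, with the head initially moving towards track 0.
The head services 37 and 14, continues to track 0, reverses, and then services 65, 67, 98, 122,
124 and 183.

The total head movement is calculated as:
= (53 – 37) + (37 – 14) + (14 – 0) + (65 – 0) + (67 – 65) + (98 – 67) + (122 – 98) + (124 – 122) +
(183 – 124)
= 16 + 23 + 14 + 65 + 2 + 31 + 24 + 2 + 59
= 236 tracks
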
The advantages of SCAN scheduling are:
• It provides a more uniform wait time than First-Come-First-Served (FCFS) or Shortest Seek
Time First (SSTF).

• It prevents the "starvation" of requests, as it will eventually service all tracks on its path.
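
The SCAN calculation can be checked with a short Python sketch. It assumes the head first moves towards track 0 and travels all the way to track 0 before reversing, as in the calculation in this section; the function name scan_towards_zero is hypothetical.

```python
def scan_towards_zero(head, requests):
    """SCAN with the head first moving towards track 0; return (service order, total movement)."""
    lower = sorted((t for t in requests if t <= head), reverse=True)  # served on the way down
    upper = sorted(t for t in requests if t > head)                   # served after reversing at 0
    order = lower + upper
    # The head travels head -> 0, then 0 -> highest pending request (if any)
    total = head + (upper[-1] if upper else 0)
    return order, total


order, total = scan_towards_zero(53, [98, 183, 37, 122, 14, 124, 65, 67])
print(order)  # [37, 14, 65, 67, 98, 122, 124, 183]
print(total)  # 236
```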

4. C-SCAN (Circular-SCAN)
In C-SCAN (Circular SCAN) disk scheduling, the disk arm moves in a single direction and services
all the requests until it reaches the other end of the disk. After reaching the end, it immediately
returns to the beginning of the disk, without servicing any requests on the return trip, and starts
servicing requests in the same direction again. This provides a more uniform wait time compared to
SCAN.
Example: Suppose we have a disk with 200 tracks numbered from 0 to 199. The head of the
disk is initially at track number 53, and the requests arrive in the following order: 98, 183, 37,
122, 14, 124, 65, 67. The direction of head movement is from track 0 to track 199.
Solution:
• The disk head starts at track 53 and moves towards the higher numbered tracks.
• It services requests at 65, 67, 98, 122, 124, and 183 in that order as it moves in the direction
of increasing track numbers.
• After the last request it continues to the end of the disk (track 199) and then jumps back to
track 0 without servicing any requests on the way.
• It then services the remaining requests, 14 and 37, moving upwards from track 0.

The total head movement is calculated as:
= (65 – 53) + (67 – 65) + (98 – 67) + (122 – 98) + (124 – 122) + (183 – 124) + (199 – 183) + (199
– 0) + (14 – 0) + (37 – 14)
= 12 + 2 + 31 + 24 + 2 + 59 + 16 + 199 + 14 + 23
= 382 tracks
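
A Python sketch of the same C-SCAN calculation is shown below. It assumes the head moves towards higher track numbers, sweeps to the last track, and that the return jump from the last track to track 0 is counted as head movement, as in the calculation in this section. The function name c_scan and the parameter max_track are illustrative.

```python
def c_scan(head, requests, max_track=199):
    """C-SCAN moving towards higher tracks; return (service order, total movement)."""
    upper = sorted(t for t in requests if t >= head)  # served on the upward sweep
    lower = sorted(t for t in requests if t < head)   # served after the jump to track 0
    order = upper + lower
    total = max_track - head        # sweep from the head position up to the last track
    if lower:
        total += max_track          # return jump from max_track to track 0
        total += lower[-1]          # sweep from track 0 up to the last remaining request
    return order, total


order, total = c_scan(53, [98, 183, 37, 122, 14, 124, 65, 67])
print(order)  # [65, 67, 98, 122, 124, 183, 14, 37]
print(total)  # 382
```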

5. LOOK
LOOK is a disk scheduling algorithm that is a variant of the SCAN algorithm. Unlike SCAN, the
LOOK algorithm "looks" ahead to see if there are any requests in the direction it is currently moving
and only travels as far as the last request in that direction. After servicing the furthest request, it
reverses direction and again services the requests as it "looks" for the next furthest request. This
way, the disk arm doesn't always move to the end of the disk if there are no requests to service,
which can reduce the average seek time.
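Example: Suppose we have a disk with 200 tracks numbered from 0 to 199. The head of the
disk is initially at track number 53, and the requests arrive in the following order: 98, 183, 37,
122, 14, 124, 65, 67. The head is initially moving towards track 0.

The head services 37 and 14, reverses at track 14 without travelling on to track 0, and then
services 65, 67, 98, 122, 124 and 183. The total head movement for the LOOK disk scheduling
algorithm is therefore (53 – 14) + (183 – 14) = 39 + 169 = 208 tracks. (If the head instead moved
first towards track 199, the total would be (183 – 53) + (183 – 14) = 130 + 169 = 299 tracks.)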
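
The LOOK total depends on the initial direction of the head, which the Python sketch below makes explicit. The function name look and the direction values "down" and "up" are assumptions for this illustration. For the request queue of this example, a head that first moves towards track 0 travels 208 tracks, and one that first moves towards track 199 travels 299 tracks.

```python
def look(head, requests, direction="down"):
    """LOOK: reverse at the furthest request instead of at the end of the disk."""
    lower = sorted((t for t in requests if t < head), reverse=True)  # towards track 0
    upper = sorted(t for t in requests if t >= head)                 # towards the last track
    order = lower + upper if direction == "down" else upper + lower[:]
    total, position = 0, head
    for track in order:
        total += abs(track - position)
        position = track
    return order, total


queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(look(53, queue, "down"))  # ([37, 14, 65, 67, 98, 122, 124, 183], 208)
print(look(53, queue, "up"))    # ([65, 67, 98, 122, 124, 183, 37, 14], 299)
```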

6. C-LOOK

The Circular-LOOK (C-LOOK) algorithm is a variant of LOOK. With C-LOOK, the disk arm moves in one
direction, servicing requests along the way, but only as far as the furthest request in that
direction. Unlike LOOK, it does not then reverse and service requests on the way back; it jumps
directly to the pending request nearest the other end of the disk and resumes servicing in the same
direction, effectively looping around like a circle. C-LOOK, therefore, always services requests in
the same direction and doesn't reverse as LOOK does.
Example: Suppose we have a disk with 200 tracks numbered from 0 to 199. The head of the
disk is initially at track number 53, and the requests arrive in the following order: 98, 183, 37,
122, 14, 124, 65, 67. The direction of head movement is from track 0 to track 199.

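The head services 65, 67, 98, 122, 124 and 183, jumps back to track 14, and then services 37. The
total head movement incurred while servicing these requests is (183 – 53) + (183 – 14) + (37 – 14)
= 130 + 169 + 23 = 322 tracks.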
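
The C-LOOK figure can be verified with the Python sketch below. It assumes the head moves towards higher track numbers and that the jump back to the lowest pending request is counted as head movement, which is what the total of 322 tracks implies. The function name c_look is illustrative.

```python
def c_look(head, requests):
    """C-LOOK moving towards higher tracks; return (service order, total movement)."""
    upper = sorted(t for t in requests if t >= head)  # served on the upward pass
    lower = sorted(t for t in requests if t < head)   # served after jumping back
    order = upper + lower
    total, position = 0, head
    for track in order:
        total += abs(track - position)  # includes the jump from 183 back to 14
        position = track
    return order, total


order, total = c_look(53, [98, 183, 37, 122, 14, 124, 65, 67])
print(order)  # [65, 67, 98, 122, 124, 183, 14, 37]
print(total)  # 322
```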

9. SUMMARY

Let’s summarize the important points you learnt in this unit:

• The operating system acts as a manager of secondary storage resources.
• Data / information stored in secondary storage has to be managed and efficiently accessed
by executing processes. To do this the operating system uses the concept of a file.
• A file is the smallest allotment of secondary storage. Any information to be stored needs to be
written on to a file.
• There are three methods of file access, viz. sequential, direct and indexed sequential.
• Contiguous allocation, linked allocation and indexed allocation are the main space allocation
methods.
• Directory implementation in a file system refers to how the operating system organises,
manages, and maintains the structure of files on storage devices. Directories support
operations such as creating, deleting, listing and renaming files, and traversing the file
system.
• The directory implementation methods are the linear list and the hash table, each with its own
drawbacks.
• Disk structure refers to the physical and logical layout of a hard disk drive (HDD) or
solid-state drive (SSD), which are the primary media for storing and retrieving digital
information. Disk scheduling algorithms include FCFS, SSTF, SCAN, C-SCAN, LOOK and C-LOOK.

10. TERMINAL QUESTIONS


1. Explain different file access methods in detail.
2. What are the different directory structures available? Explain.
3. Discuss various space allocation methods for files.

11. ANSWERS

11.1. Self-Assessment Questions


1. True
2. File
3. 512
4. False
5. Sequential Access
6. Tree Structured Directory
7. True
8. File Allocation Table
9. Contiguous File Allocation

11.2. Terminal Questions


Answer 1: Information in files can be accessed in many ways, and the choice usually depends on
the application. The access methods are:

a. Sequential access

b. Direct access

c. Indexed sequential access (Refer Section 3)

Answer 2: The directory maintains information about the name, location, size and type of all files
in the partition. A directory has a logical structure. This is dependent on many factors including
operations that are to be performed on the directory like search for file/s, create a file, delete a file,
list a directory, rename a file and traverse a file system. (Refer Section 4)

Answer 3: Allocation of disk space to files is a problem that looks at how effectively disk space is
utilized and how quickly files can be accessed. The three major methods of disk space allocation are:

a. Contiguous allocation

b. Linked allocation

c. Indexed allocation (Refer Section 5)
