
Module 5

The document discusses file-system interfaces, including concepts, access methods, and directory structures. It explains file attributes, operations on files, and various file access methods such as sequential, direct, and indexed access. Additionally, it covers directory structures, their implementations, and disk allocation methods, highlighting the advantages and disadvantages of each approach.

Uploaded by

Shashank S

MODULE 5

FILE-SYSTEM INTERFACE: Concepts, Access Methods, Directory and Disk Structure, File-System Structure, Protection.
IMPLEMENTING FILE SYSTEM: File system structure; File system implementation; Directory implementation; Allocation methods; Free space management.
FILE-SYSTEM
• What is a File?
• A file can be defined as a data structure that stores a sequence of records.
• Files are stored in a file system, which may reside on a disk or in main memory. Files can be simple (plain text) or complex (specially formatted).
• A collection of files is known as a directory. The collection of directories at the different levels is known as the file system.
Attributes of the File
• 1. Name
• Every file carries a name by which it is recognized in the file system. One directory cannot have two files with the same name.
• 2. Identifier
• A unique tag, usually a number, that identifies the file within the file system. It is the non-human-readable name for the file.
• 3. Type
• Files are classified into types such as video, audio, text, and executable files. The type is often indicated by the file extension: for example, a text file has the extension .txt and a video file can have the extension .mp4.
• 4. Location
• A file can be stored at different locations in the file system. Each file carries its location as an attribute.
• 5. Size
• The size of a file is one of its most important attributes. It is the number of bytes the file occupies on storage.
• 6. Protection
• The administrator of the computer may want different protections for different files. Each file therefore carries its own set of permissions for different groups of users.
• 7. Time and Date
• Every file carries a time stamp that records the time and date on which the file was last modified.
Operations on the File
• A file is a collection of logically related data recorded on secondary storage. The contents of a file are defined by its creator. The operations that can be performed on a file, such as create, write, read, reposition, delete, and truncate, are called file operations.
• 1. Creating a file
• Two steps are necessary to create a file. First, space in the file system must be found for the file.
• Second, an entry for the new file must be made in the directory.
• 2. Writing a file
• To write a file, we make a system call specifying both the name of the file and the information to be written to the file.
• The system must keep a write pointer to the location in the file where the next write is to take place. The write pointer must be updated whenever a write occurs.
• 3. Reading a file
• To read from a file, we use a system call that specifies the name of the file and where (in memory) the next block of the file should be put.
• The system needs to keep a read pointer to the location in the file where the next read is to take place. Once the read has taken place, the read pointer is updated.
• 4. Repositioning within a file
• The directory is searched for the appropriate entry, and the current-file-position pointer is repositioned to a given value. This operation is also known as a file seek.
• 5. Deleting a file
• To delete a file, we search the directory for the named file. Having found the associated directory entry, we release all file space, so that it can be reused by other files, and erase the directory entry.
• 6. Truncating a file
• The user may want to erase the contents of a file but keep its attributes. Rather than forcing the user to delete the file and then recreate it, this function allows all attributes to remain unchanged except for the file length, which is reset to zero.
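The six operations above can be tried out with Python's standard file API. This is only a sketch: the file name demo.txt and the temporary directory are assumptions made for the example, and the operating system performs the actual directory and space management behind these calls.

```python
import os
import tempfile

# Hypothetical file in a fresh temporary directory.
directory = tempfile.mkdtemp()
path = os.path.join(directory, "demo.txt")

with open(path, "w+b") as f:      # create: a directory entry is made
    f.write(b"hello world")       # write: advances the file position
    after_write = f.tell()        # current file position after the write
    f.seek(6)                     # reposition (file seek)
    tail = f.read()               # read from position 6 to the end
    f.truncate(0)                 # truncate: contents erased, file kept

size_after_truncate = os.path.getsize(path)
os.remove(path)                   # delete: space released, entry erased
exists_after_delete = os.path.exists(path)
os.rmdir(directory)
```

Note that truncation leaves the file in the directory with length 0, while deletion removes the directory entry itself.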
File Access Methods
• File access methods in an operating system are the techniques and
processes used to read from and write to files stored on a computer’s
storage devices.
• There are three ways to access a file in a computer system:
• Sequential Access
• Direct Access
• Indexed Access
1. Sequential Access
• The simplest access method is sequential access. Information in the file is processed in order, one record after the other. This mode of access is by far the most common; for example, editors and compilers usually access files in this fashion.
• Reads and writes make up the bulk of the operations on a file. A read operation (read next) reads the next portion of the file and automatically advances a file pointer, which tracks the I/O location.
• Similarly, the write operation (write next) appends to the end of the file and advances to the end of the newly written material.
• In the sequential access method, the operating system reads the file word by word.
• A pointer is maintained that initially points to the file's base address. When the user reads the first word of the file, the pointer supplies it and then advances to the next word. This procedure continues until the end of the file.
• The data in the file is processed in the order in which it appears in the file, which is why sequential access is simple to use.
Advantages of Sequential Access:
• The sequential access mechanism is very easy to implement.
• Records are stored in order (for example, lexicographic order of a key), which gives quick access to the next entry.
Disadvantages of Sequential Access:
• Sequential access becomes slow when the next record to be retrieved is not adjacent to the current record.
• Adding a new record may require relocating a significant number of the file's records.
2. Direct (or Relative) Access
• A file is made up of fixed-length logical records that allow programs to read and write records rapidly in no particular order.
• The direct-access method is based on a disk model of a file, since disks allow random access to any file block.
• For direct access, the file is viewed as a numbered sequence of blocks or records. Thus, we may read block 14, then read block 53, and then write block 7. There are no restrictions on the order of reading or writing for a direct-access file.
• Direct-access files are of great use for immediate access to large amounts of information.
• For example, in a database application, we may need to quickly retrieve customer data based on a specific customer ID. Direct access can reach the record containing the customer data without reading through all the records that come before it.
• For the direct-access method, the file operations must be modified to include the block number as a parameter. Thus, we have read n rather than read next, and write n rather than write next, where n is the block number.
• Then, to effect a read n, we would position to n and then read next.
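The "position to n, then read next" idea can be sketched with a seekable byte stream. The record size of 8 bytes, the helper names, and the customer strings are assumptions made for this example only.

```python
import io

RECORD_SIZE = 8   # assumed fixed record length in bytes


def write_record(f, n, data):
    # "write n": position to record n, then write one fixed-length record.
    f.seek(n * RECORD_SIZE)
    f.write(data.ljust(RECORD_SIZE, b" ")[:RECORD_SIZE])


def read_record(f, n):
    # "read n": position to record n, then read next.
    f.seek(n * RECORD_SIZE)
    return f.read(RECORD_SIZE).rstrip(b" ")


disk_file = io.BytesIO(b" " * RECORD_SIZE * 64)   # 64 blank records
write_record(disk_file, 14, b"cust-14")
write_record(disk_file, 53, b"cust-53")
write_record(disk_file, 7, b"cust-07")
```

Because every record has the same length, the byte offset of record n is simply n times the record size, so no earlier records need to be read.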
Advantages of random access
• Random access provides fast and efficient access to specific data within the file.
• It is also efficient for editing and updating data in the file.
Disadvantages of random access
• Random access requires more memory to store index or address information, which can make the file larger than with sequential access.
• If the index or address information becomes corrupted, data can become inaccessible.
3. Indexed File Access
• The index, like an index in the back of a book, contains pointers to the various blocks. To find a record in the file, we first search the index and then use the pointer to access the file directly and find the desired record.
• The indexed access method involves accessing files through an index or directory that contains a list of file names and their corresponding locations on the disk.
• The indexed access method provides a fast and efficient way to locate and access files.
• The file index is stored in a separate file or in a specific location on the disk. When a file is created, its name and location are added to the file index.
• Advantages: Provides fast and efficient access to files by name or attributes, making it suitable for applications that need to search for and retrieve specific files quickly.
• Disadvantages: The index must be maintained, which can require additional disk space and processing time.
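A minimal sketch of the "search the index, then access the file directly" step follows. The record contents, the 16-byte record size, and the use of a Python dictionary as the index are all assumptions for illustration.

```python
RECORD_SIZE = 16   # assumed fixed record length in bytes

# Hypothetical records; the first field of each record is its key.
records = [b"1001 Asha", b"1002 Ravi", b"1003 Meena"]
data_file = b"".join(r.ljust(RECORD_SIZE) for r in records)

# The index maps each key to the number of the record that holds it.
index = {r.split()[0]: n for n, r in enumerate(records)}


def lookup(key):
    # Step 1: search the index. Step 2: go directly to that record.
    n = index.get(key)
    if n is None:
        return None
    start = n * RECORD_SIZE
    return data_file[start:start + RECORD_SIZE].rstrip()
```

The index costs extra space and must be updated on every insertion or deletion, which is the maintenance overhead noted above.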
Directory and Disk Structure
• Files are stored on random-access storage devices, including hard
disks, optical disks, and solid state (memory-based) disks.
• A storage device can be used in its entirety for a file system. It can
also be subdivided for finer-grained control. For example, a disk can
be partitioned into quarters, and each quarter can hold a file system
• A directory can be defined as a listing of the related files on the disk. The directory may store some or all of the file attributes.
• To use different file systems for different operating systems, a hard disk can be divided into a number of partitions of different sizes. The partitions are also called volumes or minidisks.
• Each partition must have at least one directory in which all the files of the partition can be listed. The directory maintains one entry per file, which stores the information related to that file.
• A directory can be viewed as a file that contains the metadata of a group of files.
Structures of Directory in Operating System
• A directory is a container that is used to contain folders and files. It
organizes files and folders in a hierarchical manner. In other words,
directories are like folders that help organize files on a computer.

Different Types of Directory in OS

•Single-Level Directory
•Two-Level Directory
•Tree Structure/ Hierarchical Structure
•Acyclic Graph Structure
•General-Graph Directory Structure
• Single-level Directory
• The simplest directory structure is the single-level directory. All files
are contained in the same directory, which is easy to support and
understand
• The directory contains one entry per each file present on the file
system.
• A single level directory has a significant limitation, however, when the
number of files increases or when the system has more than one user
• Since all the files are in the same directory, they must have unique names. If two users both call their data file test, the unique-name rule is violated.
Advantages
1. Implementation is very simple.
2. If the number of files is small, searching is fast.
3. File creation, searching, and deletion are very simple, since there is only one directory.
Disadvantages
1. We cannot have two files with the same name.
2. The directory may become very large, so searching for a file may take a long time.
3. There is no way to group files of the same kind.
Two Level Directory
• A single-level directory often leads to confusion of file names among different users. The solution to this problem is to create a separate directory for each user.
• In the two-level directory structure, each user has their own user file directory (UFD). The UFDs have similar structures, but each lists only the files of a single user. The system's master file directory (MFD) is searched whenever a user logs in or a user job starts; each MFD entry points to the UFD for that user.
• Advantages
• Different users can have files with the same name, which is very helpful when there are multiple users.
• It provides security by preventing a user from accessing other users' files.
• Searching for files becomes easier, since only one user's directory has to be searched.
• Disadvantages
• The same isolation that provides security also prevents users from sharing files with one another.
• Users can create their own files but cannot create subdirectories.
Tree Structure/ Hierarchical Structure
• This directory structure resembles an upside-down tree, with the root directory at the top.
• Users can create subdirectories and store files in their own directories.
• The figure of the tree-structured directory shows files and subdirectories in each user's directory.
• Advantages
• This directory structure allows subdirectories inside a directory.
• Searching is easier.
• Separating important files from unimportant ones becomes easier.
• Disadvantages
• Users cannot modify the root directory data.
• Because users can create subdirectories, searching may become complicated as the number of subdirectories grows.
Acyclic Graph Structure
• The tree-structured directory system does not allow the same file to exist in multiple directories, so sharing is a major concern in a tree-structured directory system.
• The acyclic-graph structure addresses this by letting a file be shared between multiple users. If any user makes a change, it is visible to all the users sharing the file.
• In this system, two or more directory entries can point to the same file or subdirectory. That file or subdirectory is shared between those directory entries.
• Advantages
• Sharing of files and directories is allowed between multiple users.
• Searching is easier, since a shared file can be reached through more than one path.
• Flexibility is increased, as multiple users have sharing and editing access to the same file.
• Disadvantages
• Users must be very cautious when editing or deleting a file, because the file is accessed by multiple users.
• To delete a file permanently, all references to the file must be deleted.
General-Graph Directory Structure
• A serious problem with using an acyclic-graph structure is ensuring that there are no cycles.
• The general-graph directory can have cycles, meaning a directory can contain paths that loop back to the starting point.
• In the figure, a cycle is formed in the User 2 directory.
• While this structure offers more flexibility, it is also more complicated to implement.
• Advantages of General-Graph Directory
• More flexible than other directory structures.
• Allows cycles, meaning directories can loop back to each other.
• Disadvantages of General-Graph Directory
• More expensive to implement compared to other solutions.
• Requires garbage collection to manage and clean up unused files and
directories.
Directory implementation
• Directory implementation in the operating system can be done using a singly linked list or a hash table.
• 1. Directory Implementation using Singly Linked List
• In this algorithm, all the files in a directory are maintained as a singly linked list. Each file entry contains pointers to the data blocks assigned to the file and to the next file in the directory.
• When a new file is created, the entire list is checked to see whether the new file name matches an existing file name. If it does not, the file can be created at the beginning or at the end of the list. Checking for a unique name is therefore a big concern, because traversing the whole list takes time.
• The list needs to be traversed for every operation (creation, deletion, updating, etc.) on the files, so the system becomes inefficient.
• 2. Directory Implementation using Hash Table
• To overcome the drawbacks of the singly linked list implementation of directories, an alternative approach is the hash table. This approach uses a hash table along with the linked lists.
• A key-value pair is generated for each file in the directory and stored in the hash table. The key is determined by applying the hash function to the file name, while the value points to the corresponding file stored in the directory.
• Searching becomes efficient because the entire list is no longer searched on every operation. Only the hash table entries are checked using the key, and if an entry is found, the corresponding file is fetched using the value.
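The hash-table directory can be sketched as follows. The class name HashDirectory, the toy hash function (sum of character codes modulo the table size), and the block numbers are assumptions for illustration; each bucket is a short list that resolves collisions by chaining.

```python
class HashDirectory:
    """Toy directory: file name -> first block, stored in hashed buckets."""

    def __init__(self, buckets=8):
        self.buckets = [[] for _ in range(buckets)]

    def _bucket(self, name):
        # Toy hash function: sum of character codes modulo the table size.
        return self.buckets[sum(map(ord, name)) % len(self.buckets)]

    def create(self, name, first_block):
        bucket = self._bucket(name)
        # Only one bucket is scanned for duplicates, not the whole directory.
        if any(entry[0] == name for entry in bucket):
            raise FileExistsError(name)
        bucket.append((name, first_block))

    def lookup(self, name):
        for entry_name, first_block in self._bucket(name):
            if entry_name == name:
                return first_block
        return None

    def delete(self, name):
        bucket = self._bucket(name)
        bucket[:] = [entry for entry in bucket if entry[0] != name]


d = HashDirectory()
d.create("mail", 19)
d.create("jeep", 9)
```

With a pure linked list, the duplicate-name check in create would have to walk every entry; here it walks only the entries that hash to the same bucket.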
Allocation Methods

• The allocation methods define how the files are stored in the
disk blocks. There are three main disk space or file allocation
methods.
• Contiguous Allocation
• Linked Allocation
• Indexed Allocation
• The main idea behind these methods is to provide:
• Efficient disk space utilization.
• Fast access to the file blocks.
1. Contiguous Allocation
• In this scheme, each file occupies a contiguous set of blocks on the
disk.
• For example, if a file requires n blocks and is given block b as the starting location, then the blocks assigned to the file will be b, b+1, b+2, ..., b+n-1.
• The directory entry for a file with contiguous allocation contains
• Address of starting block
• Length of the allocated portion.
• The file 'mail' in the following figure starts from block 19 with length = 6 blocks. Therefore, it occupies blocks 19, 20, 21, 22, 23, and 24.
• Advantages:
• Both sequential and direct access are supported. For direct access, the address of the kth block of a file that starts at block b can easily be obtained as (b+k), counting k from 0.
• It is extremely fast, since the number of seeks is minimal because the file blocks are contiguous.
• Disadvantages:
• Increasing the file size is difficult, because it depends on the availability of contiguous disk space at that particular instant.
• This method suffers from both internal and external fragmentation, which makes it inefficient in terms of disk space utilization.
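The block arithmetic for contiguous allocation can be checked with a short sketch. The function names and the dictionary used as a directory are assumptions; the values for 'mail' come from the example above.

```python
def contiguous_blocks(start, length):
    # Blocks b, b+1, ..., b+n-1 for a file starting at b with n blocks.
    return list(range(start, start + length))


def block_address(start, length, k):
    # Direct access: the k-th block (counting from 0) is at start + k.
    if not 0 <= k < length:
        raise IndexError("block index outside the file")
    return start + k


# Directory entry: name -> (starting block, length in blocks).
directory = {"mail": (19, 6)}
start, length = directory["mail"]
mail_blocks = contiguous_blocks(start, length)
```

No pointers are stored anywhere; the two numbers in the directory entry are enough to locate every block of the file.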
2. Linked List Allocation
• In this scheme, each file is a linked list of disk blocks which need not
be contiguous. The disk blocks can be scattered anywhere on the
disk.
• The directory entry contains a pointer to the starting and the ending
file block. Each block contains a pointer to the next block occupied by
the file.
• The file 'jeep' in the following image shows how the blocks are scattered across the disk. The last block (25) contains -1, indicating a null pointer; it does not point to any other block.
• Advantages:
• This is very flexible in terms of file size. The file size can be increased easily, since the system does not have to look for a contiguous chunk of disk space.
• This method does not suffer from external fragmentation, which makes it relatively better in terms of disk space utilization.
• Disadvantages:
• Because the file blocks are scattered across the disk, a large number of seeks is needed to access every block individually. This makes linked allocation slower.
• It does not support random or direct access. Block k of a file can be reached only by following the block pointers through k blocks sequentially from the starting block of the file.
• The pointers required in linked allocation incur some extra overhead.
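The cost of reaching block k under linked allocation can be sketched like this. Only the last block (25) and its -1 marker come from the text; the starting block and the intermediate pointers are assumed values chosen for illustration.

```python
END = -1   # null pointer stored in the last block

# Assumed pointer chain: each disk block number maps to the next block.
next_block = {9: 16, 16: 1, 1: 10, 10: 25, 25: END}

# Directory entry: name -> (starting block, ending block).
directory = {"jeep": (9, 25)}


def kth_block(name, k):
    """Return (block number, pointer hops) for the k-th block, k from 0."""
    block, _last = directory[name]
    hops = 0
    for _ in range(k):
        block = next_block[block]   # one disk read per hop
        hops += 1
        if block == END:
            raise IndexError("file has fewer blocks than requested")
    return block, hops
```

Reaching block k always costs k pointer hops, which is why this scheme suits sequential access but not direct access.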
3. Indexed Allocation
• In this scheme, a special block known as the index block contains the pointers to all the blocks occupied by a file.
• The index block does not hold file data; it holds the pointers to all the disk blocks allocated to that particular file. The directory entry contains only the address of the index block.
• Advantages:
• This supports direct access to the blocks occupied by the file and therefore provides fast access to the file blocks.
• It overcomes the problem of external fragmentation.
• Disadvantages:
• The pointer overhead for indexed allocation is greater than for linked allocation.
• For a small file, a whole index block is largely wasted, since only a few of its pointers are used.
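A minimal sketch of indexed allocation is shown below. The index block number (19) and the data block numbers are assumed values for illustration, as are the names used in the code.

```python
# Assumed layout: index block number -> ordered list of data block pointers.
index_blocks = {19: [9, 16, 1, 10, 25]}

# The directory entry holds only the address of the index block.
directory = {"jeep": 19}


def kth_block(name, k):
    """Return the disk block holding the k-th block of the file, k from 0."""
    pointers = index_blocks[directory[name]]
    if not 0 <= k < len(pointers):
        raise IndexError("block index outside the file")
    return pointers[k]   # one index lookup, no chain to follow
```

Any block is found with a single lookup in the index block, at the price of dedicating an entire block to pointers even when the file is small.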
File-System Protection
• Computer systems store a large amount of user information. One objective of the operating system is to keep the user's data safe from improper access.
• Protection can be provided in a number of ways. For a single laptop system, we might provide protection by locking the computer in a desk drawer or file cabinet. For multi-user systems, other mechanisms are needed.
Protection
• Types of Access
• Files that can be accessed directly by other users need protection.
• Files that are not accessible to any other user do not require protection.
• Protection mechanisms provide controlled access by limiting the types of file access that can be made.
• Access is permitted or denied depending on several factors, one of which is the type of access requested. Several different types of operations may be controlled, including read, write, execute, append, delete, and list.
• Other operations, such as renaming, editing, or copying an existing file, can also be controlled.
Protection
• Access Control
• The most common approach to the protection problem is to make access dependent on the identity of the user.
• Different users may need different types of access to a file or directory.
• The general scheme for identity-dependent access is to associate with each file and directory an access-control list (ACL) that specifies the user names and the types of access allowed for each user.
• The main problem with access lists is their length. If we want to allow everyone to read a file, we must list all the users with read access. This technique has two undesirable consequences:
• Constructing such a list may be a tedious and unrewarding task, especially if we do not know in advance the list of users in the system.
• The directory entry, previously of fixed size, now needs to be of variable size, which complicates space management.
• These problems can be resolved by using a condensed version of the access list.
• To condense the length of the access-control list, many systems recognize three classifications of users in connection with each file:
• Owner: the user who created the file.
• Group: a set of users who are sharing the file and need similar access.
• Universe: all other users in the system.
• A sample directory listing from a UNIX environment is shown in the figure. The first field describes the protection of the file or directory. A d as the first character indicates a subdirectory. Also shown are the number of links to the file, the owner's name, the group's name, the size of the file in bytes, the date of last modification, and finally the file's name.
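The protection field of a UNIX listing can be reproduced with Python's standard stat module, which is a convenient way to see how one type character plus three groups of rwx bits (for the owner, the group, and all other users) are laid out. The mode values below are examples chosen for illustration, not taken from the figure.

```python
import stat


def describe(mode):
    """Split a UNIX mode into its listing string and per-class permissions."""
    text = stat.filemode(mode)          # same format as the first `ls -l` field
    return {
        "listing": text,
        "is_directory": text[0] == "d",
        "owner": text[1:4],
        "group": text[4:7],
        "universe": text[7:10],
    }


regular = describe(stat.S_IFREG | 0o644)     # plain file, rw-r--r--
subdir = describe(stat.S_IFDIR | 0o750)      # directory, rwxr-x---
```

Nine bits per file are enough for this condensed scheme, compared with a full access-control list whose length grows with the number of users.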
File System structure
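• Disks provide the bulk of secondary storage on which a file system is maintained. They have two characteristics that make them a convenient medium for storing multiple files:
• A disk can be rewritten in place: it is possible to read a block from the disk, modify the block, and write it back into the same place.
• A disk can access directly any block of information it contains, so any file can be accessed either sequentially or randomly.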
• File systems provide efficient and convenient access to the
disk by allowing data to be stored, located, and retrieved
easily.
• A file system poses two quite different design problems.
• The first problem is defining how the file system should
look to the user. This task involves defining a file and its
attributes, the operations allowed on a file, and the
directory structure for organizing files.
• The second problem is creating algorithms and data
structures to map the logical file system onto the physical
secondary-storage devices.
• Layered File System
• When an application program asks for a file, the first request is directed to the logical file system.
• The logical file system holds the metadata of the file and the directory structure. If the application program does not have the required permissions for the file, this layer reports an error. The logical file system also verifies the path to the file.
• Files are divided into logical blocks. Files are stored on and retrieved from the hard disk, which is divided into tracks and sectors. To store and retrieve files, the logical blocks need to be mapped to physical blocks. This mapping is done by the file-organization module, which is also responsible for free-space management.
• Once the file-organization module has decided which physical block the application program needs, it passes this information to the basic file system. The basic file system is responsible for issuing the commands to I/O control in order to fetch those blocks.
• I/O control contains the code used to access the hard disk. This code is known as the device drivers. I/O control is also responsible for handling interrupts.
File-System Implementation
File systems store several important data structures on the disk:
• A boot-control block (per volume) can contain information needed by the system to boot an operating system from that volume. If the disk does not contain an operating system, this block can be empty. It is typically the first block of a volume. In UFS (UNIX File System), it is called the boot block. In NTFS (New Technology File System), it is the partition boot sector.
• A volume control block (per volume) contains volume (or partition) details, such as the number of blocks in the partition, the size of the blocks, a free-block count and free-block pointers, and a free-FCB count and FCB pointers. In UFS, this is called a superblock. In NTFS, it is stored in the master file table.
• A directory structure (per file system) is used to organize the files. In UFS, this includes file names and associated inode numbers. In NTFS, it is stored in the master file table.
• A per-file FCB (file control block) contains many details about the file. It has a unique identifier number to allow association with a directory entry.
Free Space Management in Operating System
• Free space management is a critical aspect of operating systems as it
involves managing the available storage space on the hard disk or
other secondary storage devices.
• The operating system uses various techniques to manage free space
and optimize the use of storage devices.
• The free-space list records all free disk blocks, that is, those not allocated to some file or directory.
• To create a file, we search the free-space list for the required amount of space and allocate that space to the new file. This space is then removed from the free-space list. When a file is deleted, its disk space is added to the free-space list.
Bit Vector
• A bit vector is a series of bits in which each bit corresponds to a disk block.
• Each bit takes one of two values: 0 indicates that the block is free and 1 indicates an allocated block.
• The disk blocks shown in Figure 1 (where green blocks are allocated) can be represented by a 16-bit bitmap as: 1111000111111001.
• Advantages
• It is simple to understand.
• It is an efficient method.
• It occupies little memory, since only one bit is needed per block.
• Disadvantages
• To find a free block, the operating system may need to search the entire bit vector.
• Finding the first free bit efficiently (with this convention, the first 0 in a word that is not all 1s) relies on special hardware support in the form of bit-manipulation instructions.
• Keeping the bit vector in main memory is possible for smaller disks but not for larger ones.
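The bitmap from the example can be searched with a short sketch. It follows the convention used here (0 = free, 1 = allocated); numbering the blocks from 0, left to right, is an assumption, as are the function names.

```python
# Bitmap from the example: 0 = free block, 1 = allocated block.
bitmap = [int(bit) for bit in "1111000111111001"]


def first_free(bits):
    # Linear scan for the first 0 bit; None means the disk is full.
    for block, bit in enumerate(bits):
        if bit == 0:
            return block
    return None


def allocate(bits):
    # Mark the first free block as allocated and return its number.
    block = first_free(bits)
    if block is not None:
        bits[block] = 1
    return block


def release(bits, block):
    # Return a block to the free pool.
    bits[block] = 0


free_before = [block for block, bit in enumerate(bitmap) if bit == 0]
```

The scan in first_free is the linear search listed among the disadvantages; real systems examine a whole word at a time instead of one bit.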
Linked List
• In this method, all the free blocks on the disk are linked together in a linked list. The address of the first free block is kept in a known location. Each free block contains a pointer holding the address of the next free block. The last free block points to null, indicating the end of the linked list.
• For example, consider a disk having 16 blocks where block numbers 3, 4, 5, 6, 9, 10, 11, 12, 13, and 14 are free, and the rest of the blocks, i.e., block numbers 1, 2, 7, 8, 15, and 16, are allocated to some files. If we maintain a linked list, then Block 3 will contain a pointer to Block 4, Block 4 will contain a pointer to Block 5, and so on, until Block 14, which holds the null pointer.
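The free list from this example can be modeled in a few lines. The class name FreeList and the choice to return released blocks to the head of the list are assumptions made for this sketch.

```python
class FreeList:
    """Toy linked free list: each free block stores the next free block."""

    def __init__(self, free_blocks):
        # Link every free block to the one after it; the last points to None.
        self.next_free = dict(zip(free_blocks, free_blocks[1:] + [None]))
        self.head = free_blocks[0] if free_blocks else None

    def allocate(self):
        # Take the first free block and advance the head pointer.
        block = self.head
        if block is not None:
            self.head = self.next_free.pop(block)
        return block

    def release(self, block):
        # A freed block becomes the new head of the list.
        self.next_free[block] = self.head
        self.head = block


free_list = FreeList([3, 4, 5, 6, 9, 10, 11, 12, 13, 14])
```

Allocating one block is cheap because only the head is touched, but listing all free blocks means following every pointer in turn.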
• Advantages
• External fragmentation is prevented, because any free block can be used no matter where it lies on the disk.
• With linked-list allocation, the directory only needs to hold the starting and ending pointers of a file, which places less strain on it.
• Disadvantages
• This method is inefficient, since we need to read each block to traverse the list, which takes more I/O time.
• There is an overhead in maintaining the pointers.
• There is no provision for random or direct access. We must search through the linked list from its head to locate a particular block.
Grouping
• The third method of free space management in operating systems is
grouping. This method is the modification of the linked list method.
• In this method, the first free block stores the addresses of n free blocks. The first n-1 of these blocks are actually free. The last of these n blocks contains the addresses of the next n free blocks, and so on.
• For example, consider a disk having 16 blocks where block numbers 3, 4, 5, 6, 9, 10, 11, 12, 13, and 14 are free, and the rest of the blocks, i.e., block numbers 1, 2, 7, 8, 15, and 16, are allocated to some files.
• If we apply the grouping method with n = 3, Block 3 will store the addresses of Block 4, Block 5, and Block 6. Similarly, Block 6 will store the addresses of Block 9, Block 10, and Block 11. Block 11 will store the addresses of Block 12, Block 13, and Block 14.
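The n = 3 example can be traced with a small sketch. The dictionary below stands in for the address lists stored inside blocks 3, 6, and 11; the function name and the read counter are assumptions added for illustration.

```python
# Contents of the address-holding blocks in the n = 3 example.
groups = {
    3: [4, 5, 6],
    6: [9, 10, 11],
    11: [12, 13, 14],
}


def free_blocks(head, groups):
    """Return (all free blocks reachable from head, address blocks read)."""
    found = [head]
    reads = 0
    current = head
    while current in groups:
        addresses = groups[current]    # one disk read yields n addresses
        reads += 1
        found.extend(addresses)
        current = addresses[-1]        # the last address holds the next group
    return found, reads


all_free, block_reads = free_blocks(3, groups)
```

All ten free blocks are discovered with three block reads, whereas a plain linked free list would need one read per free block.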
• Advantages
• The addresses of a large number of free blocks can be found quickly.
• It is simple to locate the addresses of a whole group of free disk blocks at once.
• It is a modification of the free-list approach, so there is no need to traverse the whole list block by block.
• Disadvantages
• Part of the space of one block in each group is used for storing addresses, since the nth block stores the addresses of the next n free blocks.
• Only the address of the first free block is kept directly; the remaining addresses must be read from the disk blocks themselves.
Counting
• In this method, a linked list is maintained, but in addition to the pointer to the next free block, a count of the contiguous free blocks in the run starting at that block is also maintained.
• Thus each entry contains two things:
• A pointer to the next free block.
• The number of contiguous free blocks in the run.
• For example, consider a disk having 16 blocks where block numbers 3, 4, 5, 6, 9, 10, 11, 12, 13, and 14 are free, and the rest of the blocks, i.e., block numbers 1, 2, 7, 8, 15, and 16, are allocated to some files.
• If we apply the counting method, Block 3 will point to Block 4 and store the count 4 (since Blocks 3, 4, 5, and 6 are contiguous). Similarly, Block 9 will point to Block 10 and keep the count 6 (since Blocks 9, 10, 11, 12, 13, and 14 are contiguous).
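The counts in this example can be derived by collapsing the free blocks into runs. This sketch stores each run as a (first block, count) pair, which is an assumed representation; the function name is also hypothetical.

```python
def to_runs(free_blocks):
    """Collapse free block numbers into (first block, count) runs."""
    runs = []
    for block in sorted(free_blocks):
        # Extend the current run if this block directly follows it.
        if runs and block == runs[-1][0] + runs[-1][1]:
            runs[-1][1] += 1
        else:
            runs.append([block, 1])
    return [tuple(run) for run in runs]


free = [3, 4, 5, 6, 9, 10, 11, 12, 13, 14]
runs = to_runs(free)
```

Ten free blocks are described by just two entries, which is why the list stays short when free space is mostly contiguous.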
• Advantages
• Fast allocation of a large number of consecutive free blocks.
• Random access to the free block is possible.
• The overall list is smaller in size.
• Disadvantages
• Each entry requires more space on the disk, because the count is kept along with the address.
• For efficient insertion, deletion, and traversal, the entries need to be stored in a B-tree.
• The space available for data is slightly reduced by this bookkeeping.
