CH 2 File System Management

The document provides an overview of file concepts, attributes, operations, types, access mechanisms, directory structures, space allocation methods, and free space management in operating systems. It explains the characteristics of various file types, such as ordinary, directory, and special files, and details the different directory structures like single-level, two-level, and tree-structured directories. Additionally, it discusses file allocation techniques including contiguous, linked, and indexed allocation, as well as methods for managing free space on disk.

Uploaded by

sahilbro8698

2.1 File Concept
A file is a named collection of related information that is
recorded on secondary storage such as magnetic disks,
magnetic tapes and optical disks. In general, a file is a sequence
of bits, bytes, lines or records whose meaning is defined by the
file's creator and user.

Sr.No.  Extension  Description
1       .avi       Microsoft Video for Windows movie
2       .dbf       dBASE II, III, IV data file
3       .doc(x)    Microsoft Word for Windows document
4       .gif       Graphics Interchange Format
5       .htm       Hypertext Markup Language
6       .html      Hypertext Markup Language
7       .jpg       JPEG graphics file
8       .mpg       MPEG video file
9       .mid       MIDI music file
10      .mov       QuickTime movie

File Structure
A file must have a structure that conforms to a format the
operating system can understand.
● A file has a certain defined structure according to its
type.
● A text file is a sequence of characters organized into
lines.
● A source file is a sequence of procedures and functions.
● An object file is a sequence of bytes organized into
blocks that are understandable by the machine.
● When an operating system defines different file structures, it
must also contain the code to support these file structures.
UNIX and MS-DOS support a minimum number of file
structures.
—------------------------------------------------------------------
2.2 File Attributes
The attributes of a file may vary a little on different operating
systems. However, the common file attributes are −
Name
This denotes the symbolic name of the file. The name is the
only attribute kept in a form that humans can read easily.
Identifier
This is the tag by which the system identifies the file. It is
usually a number and uniquely identifies the file within the file system.
Type
If there are different types of files in the system, then the type
attribute denotes the type of file.
Location
This points to the device that a particular file is stored on and
also the location of the file on the device.
Size
This attribute defines the size of the file in bytes, words or
blocks. It may also specify the maximum allowed file size.
Protection
The protection attribute contains protection information for the
file such as who can read or write on the file.
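
As a rough illustration, several of these attributes can be read from a running system. The sketch below uses Python's standard os.stat call; the helper name describe and the scratch file are assumptions made for this example, and the exact values reported depend on the operating system.

```python
import os
import stat
import tempfile

def describe(path):
    """Return a few common attributes of the file at `path`."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),              # symbolic, human-readable name
        "identifier": info.st_ino,                   # numeric id (inode number on UNIX)
        "type": "directory" if stat.S_ISDIR(info.st_mode) else "file",
        "location_device": info.st_dev,              # device the file is stored on
        "size_bytes": info.st_size,                  # size in bytes
        "protection": stat.filemode(info.st_mode),   # e.g. '-rw-------'
    }

# Create a small scratch file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("hello")
    demo_path = f.name

attrs = describe(demo_path)
print(attrs)
os.remove(demo_path)
```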
-----------------------------------------------------------------------

2.3 Operations on Files


The operations that can be performed on a file are:
Creating a file
To create a file, space must first be found for it in the file
system. Then an entry for the new file must be made in the
directory. This entry should contain information about the file
such as its name and its location.
Reading a file
To read from a file, the system call should specify the name of
the file and where in memory the data read should be placed. The
system keeps a read pointer at the location where the next read
should take place. After the read is done, the read pointer is
updated.
Writing a file
To write into a file, the system call should specify the name of
the file and the contents that need to be written. The system
keeps a write pointer at the location where the next write should
take place. After the write is done, the write pointer is
updated.
Deleting a file
To delete a file, its entry must first be found in the directory.
After that, all the file space is released so it can be reused by
other files, and the directory entry is erased.
Repositioning in a file
This is also known as a file seek. To reposition within a file, the
current-file-position pointer is set to a given value. This does
not require any actual I/O operations.
Truncating a file
This deletes the data from the file without destroying its
attributes. Only the file length is reset to zero and the file
contents are erased. The rest of the attributes remain the same.
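
The operations above can be tried out with a short script. This is only a sketch using Python's built-in file API on a scratch file; the file name notes.txt and the variable names are assumptions chosen for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "notes.txt")  # hypothetical file name

# Create and write: the write pointer advances past the bytes written.
with open(path, "wb") as f:
    f.write(b"first line\nsecond line\n")
    pos_after_write = f.tell()

# Read: the read pointer advances as data is consumed.
with open(path, "rb") as f:
    first = f.readline()
    f.seek(0)               # reposition (seek): move the pointer, read nothing
    again = f.readline()

# Truncate: the length becomes zero, but the file and its name still exist.
with open(path, "r+b") as f:
    f.truncate(0)
size_after_truncate = os.path.getsize(path)

# Delete: remove the directory entry so the space can be reused.
os.remove(path)
exists_after_delete = os.path.exists(path)
```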
—------------------------------------------------------------------

● File Types:-
File type refers to the ability of the operating system to
distinguish different types of files, such as text files, source files
and binary files. Many operating systems support many
types of files. Operating systems like MS-DOS and UNIX have
the following types of files:

1)Ordinary files
● These are the files that contain user information.
● These may hold text, databases or executable programs.
● The user can apply various operations on such files, such as
adding to, modifying or deleting their contents, or even
removing the entire file.

2)Directory files
● These files contain a list of file names and other
information related to these files.

3)Special files
● These files are also known as device files.
● These files represent physical devices such as disks,
terminals, printers, networks and tape drives.
These files are of two types:
● Character special files: data is handled character by
character, as in the case of terminals or printers.
● Block special files: data is handled in blocks, as in the
case of disks and tapes.
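
On a UNIX-like system these categories can be told apart from a file's mode bits. The following is a minimal sketch using Python's stat module; the function name file_kind and the scratch paths are assumptions for illustration.

```python
import os
import stat
import tempfile

def file_kind(path):
    """Classify a path using the categories described above."""
    mode = os.stat(path).st_mode
    if stat.S_ISREG(mode):
        return "ordinary"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISCHR(mode):
        return "character special"
    if stat.S_ISBLK(mode):
        return "block special"
    return "other"

# Build a scratch directory holding one ordinary file.
scratch_dir = tempfile.mkdtemp()
scratch_file = os.path.join(scratch_dir, "data.bin")
with open(scratch_file, "wb") as f:
    f.write(b"\x00")

kinds = {"dir": file_kind(scratch_dir), "file": file_kind(scratch_file)}
print(kinds)
```

On most UNIX-like systems, calling file_kind("/dev/null") would be expected to report a character special file, although that path does not exist on every platform.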

-------------------------------------------------------------------

● 2.4 File Access Mechanisms


File access mechanism refers to the manner in which the
records of a file may be accessed. There are several ways to
access files −

1. Sequential access
2. Direct/Random access
3. Indexed sequential access

Sequential access
In sequential access the records are accessed in
some sequence, i.e., the information in the file is processed in
order, one record after the other. This access method is the
most primitive one. Example: compilers usually access files
in this fashion.

Direct/Random access
● Random access file organization allows records to be
accessed directly.
● Each record has its own address in the file, by means of
which it can be directly accessed for reading or
writing.
● The records need not be in any sequence within the file
and they need not be in adjacent locations on the storage
medium.
Indexed sequential access
● This mechanism is built on top of sequential access.
● An index is created for each file, which contains pointers
to various blocks.
● The index is searched sequentially and its pointer is used to
access the file directly.
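
The three mechanisms can be compared on a small in-memory file. The sketch below assumes fixed-length records of 8 bytes and uses made-up record contents; all function names are hypothetical.

```python
import io

RECORD_SIZE = 8  # assumed fixed record length in bytes

def make_file(names):
    """Pack names into fixed-length, space-padded records."""
    return io.BytesIO(b"".join(n.encode().ljust(RECORD_SIZE) for n in names))

def read_sequential(f):
    """Sequential access: read records one after another from the start."""
    f.seek(0)
    records = []
    while True:
        chunk = f.read(RECORD_SIZE)
        if not chunk:
            break
        records.append(chunk.decode().strip())
    return records

def read_direct(f, record_number):
    """Direct access: jump to record k using its computed address."""
    f.seek(record_number * RECORD_SIZE)
    return f.read(RECORD_SIZE).decode().strip()

def build_index(f):
    """Index: a list of (key, byte offset) pairs, one per record."""
    return [(name, i * RECORD_SIZE) for i, name in enumerate(read_sequential(f))]

def read_indexed(f, index, key):
    """Search the index sequentially, then access the record directly."""
    for name, offset in index:
        if name == key:
            f.seek(offset)
            return f.read(RECORD_SIZE).decode().strip()
    return None

f = make_file(["alice", "bob", "carol"])
all_records = read_sequential(f)
second = read_direct(f, 1)
index = build_index(f)
found = read_indexed(f, index, "carol")
```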
—------------------------------------------------------------------
2.5 DIRECTORY OVERVIEW:-

A directory is a place/area/location where a set of files is
stored. It is a folder which contains details about files, such as
file size and the times when they were created and last modified. The
different types of directories are discussed below:

Root Directory:-
The root directory is created when the disk is formatted, before
any files are put on it. In it, we can create new directories
called "sub-directories". The root directory is the highest-level
directory, and every other directory is reached from it.

Subdirectory:-
A subdirectory is a directory inside the root directory; in turn, it
can have another sub-directory in it.
A directory is a container that is used to hold folders and
files. It organises files and folders in a hierarchical manner.

There are several logical structures of a directory; these are given
below.

Single-level directory –
The single-level directory is the simplest directory structure. In it,
all files are contained in the same directory, which makes it
easy to support and understand.
A single-level directory has a significant limitation,
however, when the number of files increases or when the
system has more than one user. Since all the files are in
the same directory, they must have unique names. If
two users both call their data file test, then the unique-name
rule is violated.

Advantages:
○ Since it is a single directory, its implementation
is very easy.
○ If the number of files is small, searching is faster.
○ Operations like file creation, searching, deletion and
updating are very easy in such a directory structure.

Disadvantages:
○ There is a chance of name collision, because two
files cannot have the same name.
○ Searching becomes time-consuming if the directory is
large.
○ Files of the same type cannot be grouped together.
Two-level directory –
As we have seen, a single-level directory often leads to
confusion of file names among different users. The
solution to this problem is to create a separate directory
for each user.
In the two-level directory structure, each user has their
own user file directory (UFD). The UFDs have similar
structures, but each lists only the files of a single user.
The system's master file directory (MFD) is searched
whenever a user logs in. The MFD is
indexed by user name or account number, and each entry
points to the UFD for that user.
Advantages:
○ We can give a full path like /User-name/directory-name/.
○ Different users can have the same directory name as well as
the same file name.
○ Searching for files becomes easier due to path
names and user grouping.
Disadvantages:
○ A user is not allowed to share files with other users.
○ It is still not very scalable: files of the same type
cannot be grouped together within a single user's directory.
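
A two-level directory can be pictured as a table of tables. The sketch below models the MFD as a Python dictionary keyed by user name, with each UFD as a nested dictionary; the user names and file contents are invented for the example.

```python
# Master file directory: user name -> that user's file directory (UFD).
mfd = {}

def create_file(user, name, contents):
    """Add a file to the user's UFD, creating the UFD on first use."""
    ufd = mfd.setdefault(user, {})
    if name in ufd:
        # Names must still be unique within one user's directory.
        raise FileExistsError(f"{user} already has a file named {name}")
    ufd[name] = contents

def lookup(user, name):
    """Find the UFD through the MFD, then the file within the UFD."""
    return mfd[user][name]

create_file("asha", "test", "asha's data")
create_file("ravi", "test", "ravi's data")   # same name, different user: allowed
print(lookup("asha", "test"), "|", lookup("ravi", "test"))
```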

Tree-structured directory –
Once we have seen a two-level directory as a tree of
height 2, the natural generalization is to extend the
directory structure to a tree of arbitrary height.
This generalization allows users to create their own
subdirectories and to organise their files accordingly.

A tree structure is the most common directory structure.
The tree has a root directory, and every file in the system
has a unique path.
Advantages:
○ Very general, since a full path name can be given.
○ Very scalable; the probability of name collision is
low.
○ Searching becomes very easy; we can use both
absolute and relative paths.
Disadvantages:
○ Not every file fits into the hierarchical model;
some files may need to be saved in multiple directories.
○ We cannot share files.
○ It is inefficient, because accessing a file may require
going through multiple directories.
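
Absolute and relative path names can be illustrated with a toy tree. In this sketch directories are nested dictionaries and files are strings; the directory layout and the resolve helper are assumptions made for the example.

```python
# Directories are dicts; files are strings. All names are hypothetical.
root = {
    "home": {
        "asha": {"notes.txt": "todo", "src": {"main.c": "int main(){}"}},
    },
    "etc": {"hosts": "127.0.0.1 localhost"},
}

def resolve(path, cwd="/"):
    """Resolve an absolute path, or a path relative to `cwd`, to a node."""
    full = path if path.startswith("/") else cwd.rstrip("/") + "/" + path
    node = root
    for part in full.split("/"):
        if part:                 # skip empty pieces from leading or doubled '/'
            node = node[part]    # KeyError if a component does not exist
    return node

absolute = resolve("/home/asha/src/main.c")
relative = resolve("src/main.c", cwd="/home/asha")
```

Each path component costs one directory lookup, which is the overhead the last disadvantage refers to.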
Acyclic graph directory –
An acyclic graph is a graph with no cycles; it allows
subdirectories and files to be shared. The same file or
subdirectory may be in two different directories. It is a
natural generalization of the tree-structured directory.
It is used in situations such as when two programmers are
working on a joint project and both need to access its files.
The associated files are stored in a subdirectory,
separating them from other projects and from the files of other
programmers. Since they are working on a joint project,
both want the subdirectory to be in their own directories.
The common subdirectory should therefore be shared, so here we
use an acyclic graph directory.
Note that a shared file is not the same as a
copy of the file: if either programmer makes a change in the
shared subdirectory, the change is visible from both directories.

Advantages:
○ We can share files.
○ Searching is easy because a file can be reached by different paths.
Disadvantages:
○ Files are shared via linking, which can create
problems when a file is deleted.
○ If the link is a soft link, then after deleting the file we are
left with a dangling pointer.
○ In the case of a hard link, to delete a file we have to delete
all the references associated with it.
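
The difference between the two kinds of link on deletion can be simulated in a few lines. This is a simplified model, not how any particular file system stores links: the inode table, the link counts and the path names are all assumptions for illustration.

```python
import itertools

inodes = {}      # inode number -> {"data": ..., "links": hard-link count}
directory = {}   # path -> ("hard", inode number) or ("soft", target path)
_next_ino = itertools.count(1)

def create(path, data):
    ino = next(_next_ino)
    inodes[ino] = {"data": data, "links": 1}
    directory[path] = ("hard", ino)

def hard_link(existing, new):
    _, ino = directory[existing]
    inodes[ino]["links"] += 1           # one more reference to the same file
    directory[new] = ("hard", ino)

def soft_link(target, new):
    directory[new] = ("soft", target)   # stores only the target's path

def delete(path):
    kind, ref = directory.pop(path)
    if kind == "hard":
        inodes[ref]["links"] -= 1
        if inodes[ref]["links"] == 0:   # last reference gone: free the file
            del inodes[ref]

def read(path):
    kind, ref = directory.get(path, (None, None))
    if kind == "soft":
        return read(ref)                # follow the stored path
    if kind == "hard":
        return inodes[ref]["data"]
    return None                         # missing entry or dangling soft link

create("/proj/report", "v1")
hard_link("/proj/report", "/asha/report")
soft_link("/proj/report", "/ravi/report")
delete("/proj/report")
via_hard = read("/asha/report")   # data survives through the hard link
via_soft = read("/ravi/report")   # soft link now dangles
```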
General graph directory structure –
In a general graph directory structure, cycles are allowed
within the directory structure, and a directory
can have more than one parent directory.
The main problem with this kind of directory structure is
calculating the total size or space taken by
the files and directories.

Advantages:
○ It allows cycles.
○ It is more flexible than the other directory structures.
Disadvantages:
○ It is more costly than the others.
○ It needs garbage collection.
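
One common way to cope with cycles when computing total size is to remember which directories have already been visited. The sketch below assumes a tiny made-up directory graph in which "b" links back to "a"; it is one possible approach, not the only one.

```python
# Each directory lists file sizes and child directories. "b" links back to
# "a", which forms a cycle. All names and sizes are made up for illustration.
dirs = {
    "a": {"files": {"x": 10}, "children": ["b"]},
    "b": {"files": {"y": 5}, "children": ["c", "a"]},   # back-link to "a"
    "c": {"files": {"z": 1}, "children": []},
}

def total_size(start):
    """Sum file sizes reachable from `start`, visiting each directory once."""
    seen = set()
    stack = [start]
    total = 0
    while stack:
        name = stack.pop()
        if name in seen:
            continue          # already counted: skipping breaks the cycle
        seen.add(name)
        total += sum(dirs[name]["files"].values())
        stack.extend(dirs[name]["children"])
    return total

size_from_a = total_size("a")
```

Without the seen set, the traversal would loop between "a" and "b" forever.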

—------------------------------------------------------------------

● 2.6 Space Allocation Methods:-


Files are allocated disk space by the operating system.
Operating systems use the following three main ways to
allocate disk space to files.
1. Contiguous Allocation
2. Linked Allocation
3. Indexed Allocation

Contiguous Allocation
● Each file occupies a contiguous set of blocks on disk.
● Assigned disk addresses are in linear order.
● Easy to implement.
● External fragmentation is a major issue with this type of
allocation technique.

Linked Allocation
● Each file is a linked list of disk blocks.
● The directory contains a link / pointer to the first block of a file.
● No external fragmentation.
● Works well for sequential-access files.
● Inefficient for direct-access files.
Indexed Allocation
● Provides solutions to the problems of contiguous and linked
allocation.
● An index block is created that holds all the pointers to a file's blocks.
● Each file has its own index block, which stores the
addresses of the disk space occupied by the file.
● The directory contains the addresses of the index blocks of files.
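
The cost of reaching the k-th block of a file under each scheme can be sketched as follows. The block numbers, the file name report and the helper names are assumptions chosen for illustration; k is counted from 0.

```python
# Contiguous: the directory stores (start block, length).
contiguous_dir = {"report": (14, 3)}        # occupies blocks 14, 15, 16

# Linked: the directory stores the first block; each block names the next.
linked_dir = {"report": 9}
next_block = {9: 16, 16: 1, 1: None}        # 9 -> 16 -> 1 -> end

# Indexed: the directory stores the index block, which lists the data blocks.
indexed_dir = {"report": 19}
index_blocks = {19: [9, 16, 1]}

def kth_block_contiguous(name, k):
    start, length = contiguous_dir[name]
    if not 0 <= k < length:
        raise IndexError(k)
    return start + k                        # one addition, no pointer chasing

def kth_block_linked(name, k):
    block = linked_dir[name]
    for _ in range(k):                      # must follow k pointers in turn
        block = next_block[block]
        if block is None:
            raise IndexError(k)
    return block

def kth_block_indexed(name, k):
    return index_blocks[indexed_dir[name]][k]   # one lookup in the index block

third = (
    kth_block_contiguous("report", 2),
    kth_block_linked("report", 2),
    kth_block_indexed("report", 2),
)
```

The loop in kth_block_linked is why linked allocation suits sequential access but is slow for direct access.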

—------------------------------------------------------------------------

2.7 Free space management :-

The system keeps track of the free disk blocks for allocating
space to files when they are created. Also, to reuse the space
released by deleting files, free space management
becomes crucial. The system maintains a free space list which
keeps track of the disk blocks that are not allocated to any file
or directory. The free space list can be implemented mainly as:

Bitmap or Bit vector –


A Bitmap or Bit Vector is a series or collection of bits where
each bit corresponds to a disk block. The bit can take two
values, 0 and 1: 0 indicates that the block is allocated and 1
indicates a free block.
The given instance of disk blocks on the disk in Figure 1
(where green blocks are allocated) can be represented by a
bitmap of 16 bits as: 0000111000000110.

Advantages –
○ Simple to understand.
○ Finding the first free block is efficient. It requires
scanning the words (here, groups of 8 bits) in the bitmap
for a non-zero word. (A 0-valued word has all bits
0.) The first free block is then found by scanning for
the first 1 bit in the non-zero word.
The block number can be calculated as:
(number of bits per word) * (number of 0-valued words) +
offset of the first 1 bit in the non-zero word.

For Figure 1, we scan the bitmap sequentially for the first
non-zero word.
The first group of 8 bits (00001110) constitutes a non-zero
word, since not all of its bits are 0. After the non-zero word is found,
we look for the first 1 bit. This is the 5th bit of the non-zero
word, counting from 1. So, offset = 5.
Therefore, the first free block number = 8*0+5 = 5.
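
The calculation above can be written as a short function. This sketch assumes the bitmap is given as a string of '0' and '1' characters, uses 8-bit words as in the worked example, and numbers blocks from 1; the function name is hypothetical.

```python
WORD_BITS = 8  # word size used in the worked example

def first_free_block(bitmap):
    """Return the 1-based number of the first free block, or None if full.

    `bitmap` is a string of '0' (allocated) and '1' (free) characters.
    """
    for start in range(0, len(bitmap), WORD_BITS):
        word = bitmap[start:start + WORD_BITS]
        if "1" in word:                      # first non-zero word found
            offset = word.index("1") + 1     # 1-based position of first 1 bit
            zero_words = start // WORD_BITS  # all-zero words skipped so far
            return WORD_BITS * zero_words + offset
    return None

first = first_free_block("0000111000000110")
print(first)
```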
Linked List –
In this approach, the free disk blocks are linked together, i.e. a
free block contains a pointer to the next free block. The block
number of the very first free disk block is stored at a separate
location on disk and is also cached in memory.

In Figure-2, the free space list head points to Block 5, which
points to Block 6, the next free block, and so on. The last free
block would contain a null pointer indicating the end of the free
list.
A drawback of this method is the I/O required for free space
list traversal.
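
A linked free list can be modelled with a table of next pointers. In the sketch below the free blocks 5, 6, 7, 14 and 15 are taken from the bitmap example and are an assumption about what a matching list might contain; each pointer followed stands for one disk read.

```python
# Free blocks chained together; None marks the end of the list.
next_free = {5: 6, 6: 7, 7: 14, 14: 15, 15: None}
head = 5   # number of the first free block, kept in a known location

def allocate():
    """Take the first free block off the list and return its number."""
    global head
    if head is None:
        raise RuntimeError("no free blocks")
    block = head
    head = next_free.pop(block)   # the block itself holds the next pointer
    return block

def count_free():
    """Walk the whole list; each step stands for one disk read."""
    reads, block = 0, head
    while block is not None:
        block = next_free[block]
        reads += 1
    return reads

free_before = count_free()
got = allocate()
free_after = count_free()
```

Allocating from the head is cheap, while walking the full list touches every free block, which is the drawback noted above.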
Grouping –
This approach stores the addresses of free blocks in the first
free block. The first free block stores the addresses of some
number, say n, of free blocks. Of these n blocks, the first n-1 are
actually free and the last block contains the addresses of the next
n free blocks.
An advantage of this approach is that the addresses of a
group of free disk blocks can be found easily.
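
Grouping can be sketched in the same style. Here each index block stores up to n addresses, and the last address in a full group points to the next index block; the block numbers and function names are assumptions for illustration.

```python
def build_groups(free_blocks, n):
    """Lay out a grouped free list; returns (head, contents of index blocks)."""
    head, remaining = free_blocks[0], free_blocks[1:]
    stored = {}
    current = head
    while remaining:
        group, remaining = remaining[:n], remaining[n:]
        stored[current] = group      # `current` holds up to n addresses
        current = group[-1]          # last address is the next index block
    return head, stored

def list_free(head, stored):
    """Return (all free blocks, number of index blocks read)."""
    blocks, reads, current = [head], 0, head
    while current in stored:
        group = stored[current]
        reads += 1
        blocks.extend(group)
        current = group[-1]
    return blocks, reads

head, stored = build_groups([5, 6, 7, 14, 15, 20, 21], n=3)
blocks, reads = list_free(head, stored)
```

In this example seven free blocks are found with two index-block reads, where a plain linked list would need one read per block.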
Counting –
This approach stores the address of the first free disk block
and a number n of free contiguous disk blocks that follow the
first block.
Every entry in the list would contain:
○ Address of first free disk block
○ A number n
For example, in Figure-1, the first entry of the free space
list would be: ([Address of Block 5], 2), because 2
contiguous free blocks follow block 5.
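
The entries of a counting list can be derived from the bitmap used earlier. This is a minimal sketch that assumes blocks are numbered from 1 and that each entry is a pair (first free block, number of free blocks that follow it); the function name is hypothetical.

```python
def counting_entries(bitmap):
    """Turn a '0'/'1' bitmap (1 = free) into (first block, n) entries."""
    entries = []
    i = 0
    while i < len(bitmap):
        if bitmap[i] == "1":
            start = i
            while i < len(bitmap) and bitmap[i] == "1":
                i += 1                      # extend the run of free blocks
            # start + 1: 1-based block number; i - start - 1: blocks after it
            entries.append((start + 1, i - start - 1))
        else:
            i += 1
    return entries

entries = counting_entries("0000111000000110")
print(entries)
```

The first entry matches the example above: block 5 followed by 2 more contiguous free blocks.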

Space maps:-

In its management of free space, ZFS uses a
combination of techniques to control the size of data
structures and minimize the I/O needed to manage those
structures. First, ZFS creates metaslabs to divide the space on
the device into chunks of manageable size. A given volume
may contain hundreds of metaslabs. Each metaslab has an
associated space map. ZFS uses the counting algorithm to
store information about free blocks. Rather than write
counting structures to disk, it uses log-structured file-system
techniques to record them.

An in-memory space map is constructed using a balanced
tree data structure, built from the log data. The
combination of the in-memory tree and the on-disk log
provides for very fast and efficient management of
free space.
