CH 2 File System Management
2.1 File Concept
A file is a named collection of related information that is
recorded on secondary storage such as magnetic disks,
magnetic tapes and optical disks. In general, a file is a sequence
of bits, bytes, lines or records whose meaning is defined by the
file's creator and user.
1 .avi
2 .dbf
3 .doc(x)
5 .htm
6 .html
7 .jpg
8 .mpg
9 .mid MIDI music file
10 .mov QuickTime movie
File Structure
A file structure must follow a format that the operating
system can understand.
● A file has a defined structure according to its type.
● A text file is a sequence of characters organized into
lines.
● A source file is a sequence of procedures and functions.
● An object file is a sequence of bytes organized into
blocks that are understandable by the machine.
● When an operating system defines different file structures,
it must also contain the code to support each of them.
● UNIX and MS-DOS support a minimal number of file
structures.
-------------------------------------------------------------------
2.2 File Attributes
The attributes of a file may vary a little on different operating
systems. However, the common file attributes are −
Name
This denotes the symbolic name of the file. The file name is the
only attribute kept in a form that humans can read easily.
Identifier
This is the tag by which the system itself identifies the file. It
is usually a number and is unique within the file system.
Type
If there are different types of files in the system, then the type
attribute denotes the type of file.
Location
This points to the device that a particular file is stored on and
also the location of the file on the device.
Size
This attribute defines the size of the file in bytes, words or
blocks. It may also specify the maximum allowed file size.
Protection
The protection attribute contains access-control information for
the file, such as who can read or write the file.
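The attributes listed above can be inspected on a real system. The following is a minimal sketch, assuming Python's standard os and stat modules on a POSIX-like system; the helper name describe and the scratch file notes.txt are made up for illustration. Note that st_dev only names the device, not the position of the file on it.

```python
import os
import stat
import tempfile

def describe(path):
    """Return the common attributes of `path` as a dictionary."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),              # symbolic, human-readable name
        "identifier": info.st_ino,                   # inode number on POSIX systems
        "type": "directory" if stat.S_ISDIR(info.st_mode) else "file",
        "location": info.st_dev,                     # device that holds the file
        "size": info.st_size,                        # size in bytes
        "protection": stat.filemode(info.st_mode),   # permission string, e.g. '-rw-r--r--'
    }

# Create a small scratch file and look at its attributes.
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "notes.txt")
    with open(sample, "w") as handle:
        handle.write("hello")
    attributes = describe(sample)
    print(attributes)
```

The exact identifier, location and protection values depend on the machine the sketch runs on.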
-----------------------------------------------------------------------
● File Types:-
File type refers to the ability of the operating system to
distinguish different types of files, such as text files, source
files and binary files. Most operating systems support several
types of files. Operating systems such as MS-DOS and UNIX have
the following types of files −
1)Ordinary files
● These are the files that contain user information.
● These may contain text, databases or executable programs.
● The user can apply various operations to such files, such as
adding, modifying or deleting content, or even removing the
entire file.
2)Directory files
● These files contain a list of file names and other
information related to these files.
3)Special files
● These files are also known as device files.
● These files represent physical devices such as disks,
terminals, printers, networks, tape drives etc.
These files are of two types −
● Character special files − data is handled character by
character as in case of terminals or printers.
● Block special files − data is handled in blocks as in the
case of disks and tapes.
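On UNIX-like systems the file type is recorded in the file's mode bits, so the categories above can be told apart programmatically. A minimal sketch, assuming Python's standard os and stat modules; the function name file_kind is an illustrative choice.

```python
import os
import stat

def file_kind(path):
    """Classify `path` into the file types described above."""
    mode = os.stat(path).st_mode
    if stat.S_ISREG(mode):
        return "ordinary"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISCHR(mode):
        return "character special"
    if stat.S_ISBLK(mode):
        return "block special"
    return "other"

# The current directory is always a directory file.
print(file_kind(os.curdir))

# /dev/null exists on most UNIX-like systems (an assumption; skipped elsewhere).
if os.path.exists("/dev/null"):
    print(file_kind("/dev/null"))
```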
-------------------------------------------------------------------
A file can be accessed in the following ways:
1. Sequential access
2. Direct/Random access
3. Indexed sequential access
Sequential access
In sequential access, the records are accessed in sequence, i.e.,
the information in the file is processed in order, one record
after the other. This is the simplest access method. Example:
compilers usually access files in this fashion.
Direct/Random access
● Random access file organization allows the records to be
accessed directly.
● Each record has its own address in the file, by means of
which it can be directly accessed for reading or writing.
● The records need not be in any sequence within the file
and they need not be in adjacent locations on the storage
medium.
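The difference between the two access methods can be sketched with fixed-size records. This is only an illustration: the record layout (a 4-byte id followed by a 16-byte name) and the function names are assumptions, and an in-memory io.BytesIO object stands in for a file on disk.

```python
import io
import struct

# Hypothetical record layout: 4-byte integer id + 16-byte name field.
RECORD = struct.Struct("<i16s")

def write_records(stream, rows):
    """Append each (id, name) pair as one fixed-size record."""
    for ident, name in rows:
        stream.write(RECORD.pack(ident, name.encode("ascii")))

def read_sequential(stream):
    """Sequential access: read every record in order from the start."""
    stream.seek(0)
    rows = []
    while True:
        chunk = stream.read(RECORD.size)
        if len(chunk) < RECORD.size:
            break
        ident, raw = RECORD.unpack(chunk)
        rows.append((ident, raw.rstrip(b"\0").decode("ascii")))
    return rows

def read_direct(stream, index):
    """Direct access: compute the record's address and jump straight to it."""
    stream.seek(index * RECORD.size)
    ident, raw = RECORD.unpack(stream.read(RECORD.size))
    return ident, raw.rstrip(b"\0").decode("ascii")

disk_file = io.BytesIO()
write_records(disk_file, [(1, "alpha"), (2, "beta"), (3, "gamma")])
print(read_sequential(disk_file))
print(read_direct(disk_file, 2))
```

Because every record has the same size, the address of record i is simply i times the record size, which is what makes the direct jump possible.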
Indexed sequential access
● This mechanism is built on top of sequential access.
● An index is created for each file, containing pointers
to its various blocks.
● The index is searched sequentially and the pointer found
is used to access the file directly.
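The index lookup described above can be sketched as follows. This is a simplified model under stated assumptions: records are (key, value) pairs kept sorted by key, each block holds three records, and the index stores the smallest key of every block. All names are illustrative.

```python
BLOCK_SIZE = 3  # records per block (an assumed value)

def build_blocks(sorted_records, block_size=BLOCK_SIZE):
    """Split the sorted records into consecutive blocks."""
    return [sorted_records[i:i + block_size]
            for i in range(0, len(sorted_records), block_size)]

def build_index(blocks):
    """One index entry per block: (smallest key in the block, block number)."""
    return [(block[0][0], number) for number, block in enumerate(blocks)]

def lookup(key, index, blocks):
    """Search the index sequentially, then go directly to the chosen block."""
    target = None
    for first_key, number in index:
        if first_key > key:
            break
        target = number
    if target is None:
        return None
    for record_key, value in blocks[target]:
        if record_key == key:
            return value
    return None

records = [(10, "a"), (20, "b"), (30, "c"), (40, "d"), (50, "e")]
blocks = build_blocks(records)
index = build_index(blocks)
print(index)
print(lookup(40, index, blocks))
```

Only the small index is scanned in order; the data blocks themselves are reached through the pointer, so most of the file is never read.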
-------------------------------------------------------------------
2.5 DIRECTORY OVERVIEW:-
Root Directory:-
The root directory is created when the disk is formatted, before
any files are placed on it. Inside it we can create new directories
called "sub-directories". The root directory is the highest-level
directory and is the one seen when the system boots.
Subdirectory:-
A subdirectory is a directory inside the root directory; in turn,
it can contain further sub-directories.
A directory is a container that holds files and folders. It
organises files and folders in a hierarchical manner.
Single-level directory –
The single-level directory is the simplest directory structure.
In it, all files are contained in the same directory, which
makes it easy to support and understand.
A single-level directory has a significant limitation,
however, when the number of files increases or when the
system has more than one user. Since all the files are in
the same directory, they must have unique names. If
two users both call their data set "test", the unique-name
rule is violated.
Advantages:
○ Since there is only a single directory, its implementation
is very easy.
○ If the directory holds only a few files, searching is fast.
○ Operations such as file creation, searching, deletion and
updating are very easy in such a directory structure.
Disadvantages:
○ There is a chance of name collision because two
files cannot have the same name.
○ Searching becomes time-consuming if the directory is
large.
○ Files of the same type cannot be grouped together.
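The unique-name rule and the linear search cost can be illustrated with a small in-memory model. This is a sketch only; the class name SingleLevelDirectory and its methods are hypothetical and do not correspond to any real file system API.

```python
class SingleLevelDirectory:
    """Toy single-level directory: one flat list shared by all users."""

    def __init__(self):
        self.entries = []  # list of [file name, contents]

    def search(self, name):
        """Linear scan: cost grows with the number of files."""
        for entry in self.entries:
            if entry[0] == name:
                return entry[1]
        return None

    def create(self, name, data=""):
        """Refuse a second file with a name that is already in use."""
        if self.search(name) is not None:
            raise FileExistsError(name)
        self.entries.append([name, data])

    def delete(self, name):
        self.entries = [entry for entry in self.entries if entry[0] != name]

directory = SingleLevelDirectory()
directory.create("test", "data of user A")
try:
    directory.create("test", "data of user B")  # second user, same name
    collision = False
except FileExistsError:
    collision = True
print("name collision:", collision)
```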
Two-level directory –
As we have seen, a single-level directory often leads to
confusion of file names among different users. The
solution to this problem is to create a separate directory
for each user.
In the two-level directory structure, each user has their
own user file directory (UFD). The UFDs have similar
structures, but each lists only the files of a single user.
The system's master file directory (MFD) is searched
whenever a user logs in. The MFD is
indexed by user name or account number, and each entry
points to the UFD for that user.
Advantages:
○ We can give a full path such as /User-name/directory-name/.
○ Different users can have the same directory name as well as
the same file name.
○ Searching for files becomes easier due to path
names and user grouping.
Disadvantages:
○ A user is not allowed to share files with other users.
○ It is still not very scalable: files of the same type
cannot be grouped together within one user's directory.
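The MFD/UFD relationship can be modelled as a dictionary of dictionaries. The sketch below is illustrative only: the class name, method names and the user names alice and bob are assumptions, and paths are taken to have the form /user/file.

```python
class TwoLevelDirectory:
    """Toy two-level directory: the MFD maps each user to that user's UFD."""

    def __init__(self):
        self.mfd = {}  # user name -> UFD, where a UFD maps file name -> contents

    def add_user(self, user):
        self.mfd.setdefault(user, {})

    def create(self, user, name, data=""):
        ufd = self.mfd[user]            # look the user up in the MFD
        if name in ufd:                 # names must be unique only inside one UFD
            raise FileExistsError(f"/{user}/{name}")
        ufd[name] = data

    def read(self, path):
        """Resolve a full path of the form /user/file."""
        user, name = path.strip("/").split("/")
        return self.mfd[user][name]

system = TwoLevelDirectory()
system.add_user("alice")
system.add_user("bob")
system.create("alice", "test", "alice's data")
system.create("bob", "test", "bob's data")   # same file name, no collision
print(system.read("/alice/test"))
print(system.read("/bob/test"))
```

Each user sees only their own UFD, which removes the name collision but also explains why sharing is not possible in this structure.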
Tree-structured directory –
Once we have seen a two-level directory as a tree of
height 2, the natural generalization is to extend the
directory structure to a tree of arbitrary height.
This generalization allows users to create their own
subdirectories and to organise their files accordingly.
Advantages:
○ Files can be shared through links.
○ Searching is easy because a file can be reached through
different paths.
Disadvantages:
○ Files are shared via linking, so deleting a file may
create problems.
○ If the link is a soft link, then after deleting the file we
are left with a dangling pointer.
○ In the case of a hard link, to delete a file we have to
delete all the references associated with it.
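The two deletion problems above can be reproduced in a toy model. This sketch assumes a simplified scheme in which a hard link is a second name for the same inode with a link count, and a soft link merely stores the target path; the class LinkedNamespace and the example paths are hypothetical.

```python
class LinkedNamespace:
    """Toy model: a name maps to an inode (hard link) or to another name (soft link)."""

    def __init__(self):
        self.inodes = {}   # inode number -> {"data": ..., "links": count}
        self.names = {}    # path -> ("hard", inode number) or ("soft", target path)
        self._next = 1

    def create(self, path, data):
        self.inodes[self._next] = {"data": data, "links": 1}
        self.names[path] = ("hard", self._next)
        self._next += 1

    def hard_link(self, existing, new):
        kind, number = self.names[existing]
        if kind != "hard":
            raise ValueError("this sketch only hard-links to regular names")
        self.inodes[number]["links"] += 1
        self.names[new] = ("hard", number)

    def soft_link(self, target, new):
        self.names[new] = ("soft", target)   # stores the path, not the inode

    def delete(self, path):
        kind, ref = self.names.pop(path)
        if kind == "hard":
            self.inodes[ref]["links"] -= 1
            if self.inodes[ref]["links"] == 0:   # data is freed with the last reference
                del self.inodes[ref]

    def read(self, path):
        kind, ref = self.names[path]
        if kind == "soft":
            if ref not in self.names:
                raise FileNotFoundError(f"dangling link: {path} -> {ref}")
            return self.read(ref)
        return self.inodes[ref]["data"]

fs = LinkedNamespace()
fs.create("/a/report", "Q1 figures")
fs.hard_link("/a/report", "/b/report")
fs.soft_link("/a/report", "/c/report")
fs.delete("/a/report")
print(fs.read("/b/report"))   # the hard link still reaches the data
```

After the delete, reading /c/report raises FileNotFoundError because the soft link now dangles, while the data itself survives until /b/report is removed as well.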
General graph directory structure –
In the general graph directory structure, cycles are allowed
within the directory structure, and a directory can have
more than one parent directory.
The main problem with this kind of directory structure is
calculating the total size or space taken up by the files
and directories, because a cycle can cause the same entry
to be visited more than once.
Advantages:
○ It allows cycles.
○ It is more flexible than the other directory structures.
Disadvantages:
○ It is more costly than others.
○ It needs garbage collection.
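One common way to handle the size calculation in the presence of cycles is to remember which entries have already been visited. The sketch below assumes the directory graph is given as plain dictionaries; the names children, file_sizes and total_size are illustrative.

```python
def total_size(root, children, file_sizes):
    """Sum file sizes reachable from `root`, visiting every entry only once."""
    seen = set()
    stack = [root]
    total = 0
    while stack:
        node = stack.pop()
        if node in seen:        # already counted, or we followed a cycle back
            continue
        seen.add(node)
        total += file_sizes.get(node, 0)       # directories contribute 0 here
        stack.extend(children.get(node, []))
    return total

# "docs" and "shared" point at each other, forming a cycle.
children = {
    "root": ["docs", "shared"],
    "docs": ["a.txt", "shared"],
    "shared": ["b.txt", "docs"],
}
file_sizes = {"a.txt": 100, "b.txt": 50}
print(total_size("root", children, file_sizes))
```

Without the seen set the traversal would loop forever between docs and shared; with it, each file is counted exactly once.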
-------------------------------------------------------------------
Contiguous Allocation
● Each file occupies a contiguous address space on disk.
● Assigned disk address is in linear order.
● Easy to implement.
● External fragmentation is a major issue with this type of
allocation technique.
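External fragmentation can be seen in a small simulation. The sketch below assumes a first-fit placement policy (one of several possible policies) on a ten-block disk modelled as a Python list; all function names are illustrative.

```python
def first_fit(disk, length):
    """Return the start of the first run of `length` free blocks, or None."""
    run_start, run_length = None, 0
    for block, owner in enumerate(disk):
        if owner is None:
            if run_length == 0:
                run_start = block
            run_length += 1
            if run_length == length:
                return run_start
        else:
            run_length = 0
    return None

def allocate(disk, name, length):
    """Give `name` a contiguous run of blocks; return its start or None."""
    start = first_fit(disk, length)
    if start is None:
        return None
    for block in range(start, start + length):
        disk[block] = name
    return start

def release(disk, name):
    """Free every block owned by `name`."""
    for block, owner in enumerate(disk):
        if owner == name:
            disk[block] = None

disk = [None] * 10            # None marks a free block
allocate(disk, "A", 3)
allocate(disk, "B", 3)
allocate(disk, "C", 3)
release(disk, "A")
release(disk, "C")
free_blocks = disk.count(None)
result = allocate(disk, "D", 5)
print(free_blocks, result)
```

Seven blocks are free in total, yet a five-block file cannot be placed because the free space is split into runs of three and four blocks.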
Linked Allocation
● Each file carries a list of links to disk blocks.
● Directory contains link / pointer to first block of a file.
● No external fragmentation
● Works well for sequential-access files.
● Inefficient for direct-access files, because the chain of
links must be followed from the first block.
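The chain of links can be sketched with a dictionary that maps a block number to its data and the number of the next block. The block numbers, the file name report and the helper names are assumptions chosen for illustration.

```python
# block number -> (data stored in the block, number of the next block or None)
linked_disk = {9: ("A", 16), 16: ("B", 1), 1: ("C", 10), 10: ("D", None)}
linked_directory = {"report": 9}   # the directory stores only the first block

def read_file(name):
    """Sequential read: follow the links from the first block to the last."""
    data = []
    block = linked_directory[name]
    while block is not None:
        payload, block = linked_disk[block]
        data.append(payload)
    return "".join(data)

def read_block(name, index):
    """Direct read of the index-th block still has to walk `index` links."""
    block = linked_directory[name]
    for _ in range(index):
        block = linked_disk[block][1]
    return linked_disk[block][0]

print(read_file("report"))
print(read_block("report", 2))
```

The blocks are scattered over the disk (9, 16, 1, 10), which avoids external fragmentation, but reaching block i costs i link traversals.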
Indexed Allocation
● Provides solutions to problems of contiguous and linked
allocation.
● An index block is created that holds all the pointers to a
file's blocks.
● Each file has its own index block which stores the
addresses of disk space occupied by the file.
● Directory contains the addresses of index blocks of files.
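With an index block, direct access needs only one extra lookup. A minimal sketch under the same kind of assumptions as before: block 19 is taken to be the index block of a hypothetical file report, and the names are illustrative.

```python
# Block 19 is the index block; the other blocks hold the file's data.
indexed_disk = {19: [9, 16, 1, 10], 9: "A", 16: "B", 1: "C", 10: "D"}
indexed_directory = {"report": 19}   # the directory stores the index block address

def read_indexed_block(name, index):
    """Direct access: one lookup in the index block, then one data block read."""
    index_block = indexed_disk[indexed_directory[name]]
    return indexed_disk[index_block[index]]

def read_indexed_file(name):
    """Sequential access: visit the data blocks in index order."""
    index_block = indexed_disk[indexed_directory[name]]
    return "".join(indexed_disk[block] for block in index_block)

print(read_indexed_block("report", 2))
print(read_indexed_file("report"))
```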
-------------------------------------------------------------------
The system keeps track of the free disk blocks so that it can
allocate space to files when they are created. Free space
management is also crucial for reusing the space released when
files are deleted. The system maintains a free space list which
keeps track of the disk blocks that are not allocated to any file
or directory. The free space list can be implemented mainly as:
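Bitmap –
In this approach, each disk block is represented by one bit of a
bitmap. In the example that follows, a 1 bit marks a free block
and a 0 bit marks an allocated block.
Advantages –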
○ Simple to understand.
○ Finding the first free block is efficient. It requires
scanning the words (a group of 8 bits) in a bitmap
for a non-zero word. (A 0-valued word has all bits
0). The first free block is then found by scanning for
the first 1 bit in the non-zero word.
The block number can be calculated as:
(number of bits per word) * (number of 0-valued words) +
offset of the first 1 bit in the non-zero word.
For Figure-1, we scan the bitmap sequentially for the first
non-zero word.
The first group of 8 bits (00001110) constitutes a non-zero
word, since not all of its bits are 0. After the non-zero word
is found, we look for the first 1 bit. This is the 5th bit of the
non-zero word. So, offset = 5.
Therefore, the first free block number = 8*0+5 = 5.
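The calculation above can be checked with a short sketch. It follows the convention of the worked example: 8-bit words, a 1 bit meaning free, and offsets counted from 1 starting at the leftmost bit. The function name first_free_block is an illustrative choice.

```python
BITS_PER_WORD = 8

def first_free_block(words):
    """Return the number of the first free block, or None if no block is free.

    Each word is an integer holding BITS_PER_WORD bits; a 1 bit marks a free
    block and offsets are 1-based, counted from the leftmost bit.
    """
    for zero_words, word in enumerate(words):
        if word != 0:                                   # first non-zero word
            bits = format(word, f"0{BITS_PER_WORD}b")   # e.g. '00001110'
            offset = bits.index("1") + 1                # position of the first 1 bit
            return BITS_PER_WORD * zero_words + offset
    return None

print(first_free_block([0b00001110]))              # the word from the example
print(first_free_block([0b00000000, 0b00100000]))  # one all-zero word first
```

For the word 00001110 the sketch gives 8*0+5 = 5, matching the worked example; with one all-zero word in front of 00100000 it gives 8*1+3 = 11.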
Linked List –
In this approach, the free disk blocks are linked together, i.e. a
free block contains a pointer to the next free block. The block
number of the very first free disk block is stored at a separate
location on disk and is also cached in memory.
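A free list of this kind can be sketched as follows. The class name FreeList, the block numbers and the choice to put released blocks back at the head of the list are assumptions made for illustration.

```python
class FreeList:
    """Toy linked free list: each free block stores the number of the next one."""

    def __init__(self, free_blocks):
        self.next_free = {}   # free block number -> next free block (or None)
        self.head = None      # number of the first free block, cached in memory
        for block in reversed(free_blocks):
            self.next_free[block] = self.head
            self.head = block

    def allocate(self):
        """Take the first free block off the list; None if the disk is full."""
        if self.head is None:
            return None
        block = self.head
        self.head = self.next_free.pop(block)
        return block

    def release(self, block):
        """Return a block to the front of the free list."""
        self.next_free[block] = self.head
        self.head = block

free_list = FreeList([5, 6, 7, 13])
first = free_list.allocate()
second = free_list.allocate()
free_list.release(first)
print(first, second, free_list.head)
```

Allocation and release only touch the head of the list, so both are cheap, but finding several free blocks means following the pointers one block at a time.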
Space maps:-