5-FileSystem
Files are managed by the operating system. The part of the operating system that
deals with files is called the file system. The file system determines how files are
named, accessed, used, protected and implemented.
A file is a collection of related information. A file may contain data or a program.
A program can be a source file, an object file or some other form. Data can be numeric,
alphabetic or alphanumeric. Many kinds of information can therefore be stored in a file:
source or object programs, numeric, alphabetic or alphanumeric data, graphical images,
movie clips, sounds and so on. The structure of a file is defined by its type, e.g. the
structure of a file containing graphical images differs from that of a file containing
a source program or textual information.
Every file in the operating system has a name. The naming rules differ from one
operating system to another. The kind of information stored in a file determines the
type of the file, e.g. a text file contains text whereas a graphics file contains a
picture.
Apart from the file name and file type, other important properties are the creation
date and time of the file, its creator/owner, its size and its permissions or
attributes.
An important consideration in the design of an operating system is whether it
supports file types. If the operating system recognizes the type of a file and the
data it contains, it can manipulate the file in a better manner; otherwise the
information stored in the file is of no use to the operating system itself.
Recognizing file types has an obvious disadvantage. If the operating system defines
20 different file types, it must contain the code to support each of them. A further
problem arises for files whose type is not supported. Many operating systems use this
approach, e.g. Windows 95/98.
Another technique, adopted by some operating systems, is to have no file types at all.
The UNIX operating system uses this technique and treats each file as a sequence of
bytes. This gives flexibility but less support, as each program must include its own
code to convert the data into its own structures.
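To illustrate the untyped approach, the following minimal Python sketch shows a program
imposing its own structure on a plain sequence of bytes. The record layout (a 4-byte
integer id followed by an 8-byte float) and the function names are assumptions made for
this example only; they are not part of UNIX or any other operating system.

```python
import struct

# Hypothetical record layout chosen for illustration:
# a 4-byte unsigned integer id followed by an 8-byte float, little-endian.
RECORD = struct.Struct("<Id")


def encode_records(records):
    """Turn (id, value) pairs into the raw bytes that would be stored in a file."""
    return b"".join(RECORD.pack(rec_id, value) for rec_id, value in records)


def decode_records(raw):
    """Rebuild (id, value) pairs from raw bytes; the OS itself sees only bytes."""
    return [RECORD.unpack_from(raw, offset)
            for offset in range(0, len(raw), RECORD.size)]


raw = encode_records([(1, 2.5), (2, -1.0)])
records = decode_records(raw)
```

The operating system would store and return `raw` unchanged; only the program knows
that every 12 bytes form one record.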
Internal Fragmentation
Files are stored on disks. Disk systems have a well-defined block size determined by
the size of a sector. Disk I/O is performed in units of one block, and all blocks are
of the same size. As disk space is always allocated in whole blocks, some portion of
the last block of each file is usually wasted. If the block size is 512 bytes, a file
of 1949 bytes is allocated four blocks (2048 bytes) and the last 99 bytes are wasted.
The bytes wasted in order to keep everything in units of blocks (instead of bytes) are
called internal fragmentation. All systems suffer from internal fragmentation, and the
larger the block size, the greater the internal fragmentation.
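The arithmetic above can be reproduced with a short Python sketch. The function names
are chosen for this example; the 4096-byte block size in the last line is an assumed
value, used only to show how the waste grows with the block size.

```python
def blocks_needed(file_size, block_size):
    """Number of whole blocks required to hold file_size bytes."""
    return -(-file_size // block_size)  # ceiling division on integers


def internal_fragmentation(file_size, block_size):
    """Bytes allocated to the file but not used by it."""
    return blocks_needed(file_size, block_size) * block_size - file_size


blocks = blocks_needed(1949, 512)                 # blocks for the example file
wasted_small = internal_fragmentation(1949, 512)  # waste with 512-byte blocks
wasted_large = internal_fragmentation(1949, 4096) # waste with 4096-byte blocks
```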
i) Device Directory
ii) File Directory
The Device Directory describes all files on a device. It describes the physical
properties of a file, i.e. its location, its size, etc.
The File Directory describes the logical properties of files, i.e. file name, file
type, the file owner's name, file permissions, etc. For the physical properties of a
file, the File Directory can point to the Device Directory.
The information about files stored in a directory differs from operating system to
operating system. Information that can be kept in a file directory includes:
File Name
Contains the name of the file.
File Type
Contains the information about the type of file (used where the system supports files
of different types)
Location
Location of the file
Size
Size of the file in bytes
Current Position
Position of the pointer in the file
Protection
Contains the file's protection information, i.e. who can read, write or execute the file.
Usage
A value indicating the usage of the file
Process Identification
Identification number when the file is executed
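The fields listed above could be grouped into a single record per file. The following
Python sketch is one possible layout, not the layout of any particular operating
system; the field types, the default values and the sample entry are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DirectoryEntry:
    """One file directory entry, mirroring the fields listed above."""
    name: str                          # File Name
    file_type: str                     # File Type
    location: int                      # Location (assumed: starting block number)
    size: int                          # Size in bytes
    current_position: int = 0          # Current Position of the file pointer
    protection: str = "rw-"            # Protection (assumed: rwx-style string)
    usage: int = 0                     # Usage value
    process_id: Optional[int] = None   # Process Identification, if any


# Hypothetical sample entry for illustration.
entry = DirectoryEntry(name="notes.txt", file_type="text", location=120, size=1949)
```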
Hash Table
Another data structure used is a hash table. In a hash table the search time is short,
and insertion and deletion of files are also simple. The problem is the fixed size of
the table, e.g. a hash table of 64 entries converts file names into integers from 0 to
63. To add a 65th entry, the size of the hash table has to be increased, and all the
existing entries must then be rehashed so that each one occupies the correct position
in the new table.
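The growth problem described above can be sketched in Python. This is a minimal
illustration under stated assumptions: the class name, the byte-sum hash function,
linear probing for collisions and doubling the table on overflow are all choices made
for this example, and deletion is left out to keep it short.

```python
class HashDirectory:
    """Fixed-size hash table of (file name, location) entries."""

    def __init__(self, size=64):
        self.size = size
        self.slots = [None] * size
        self.count = 0

    def _hash(self, name):
        # Assumed hash function: sum of the name's bytes modulo the table size.
        return sum(name.encode()) % self.size

    def _probe(self, name):
        """Return the slot holding name, or the first free slot, or None if full."""
        i = self._hash(name)
        for _ in range(self.size):
            if self.slots[i] is None or self.slots[i][0] == name:
                return i
            i = (i + 1) % self.size
        return None

    def insert(self, name, location):
        i = self._probe(name)
        if i is None:          # table is full: enlarge it and rehash everything
            self._grow()
            i = self._probe(name)
        if self.slots[i] is None:
            self.count += 1
        self.slots[i] = (name, location)

    def lookup(self, name):
        i = self._probe(name)
        if i is None or self.slots[i] is None:
            return None
        return self.slots[i][1]

    def _grow(self):
        old_entries = [e for e in self.slots if e is not None]
        self.size *= 2
        self.slots = [None] * self.size
        self.count = 0
        for name, location in old_entries:
            self.insert(name, location)  # every entry gets a new position
```

Inserting 64 names fills the table; the 65th insertion triggers `_grow`, which
recomputes the slot of every existing entry because the hash depends on the table size.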
Access Methods
There are two ways through which the information stored in the file can be
accessed. One is called the Sequential Access whereas the other is called Random or
Direct Access.
Sequential Access
In early operating systems, the access method was sequential. A process reads all the
bytes or records of the file in order, starting from the first and reading them one by
one. Sequential access is the method used for storage media such as tape drives.
Random/Direct Access
Files whose records can be accessed in any order are called “Random or Direct Access”
files. Random or direct access became possible when information began to be stored on
Direct Access Storage Devices (DASD), e.g. hard disks and floppy diskettes. In random
or direct access, the records of a file are stored on the basis of some key, so any
record can be found easily.
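The difference between the two access methods can be sketched in Python. The example
assumes fixed-size records of 16 bytes and uses the record number as the key; the
record size, file name and function names are hypothetical, and records longer than 16
bytes are not handled.

```python
import os
import tempfile

RECORD_SIZE = 16  # assumed fixed record length in bytes


def write_records(path, records):
    """Store each record padded with spaces to RECORD_SIZE bytes."""
    with open(path, "wb") as f:
        for record in records:
            f.write(record.encode().ljust(RECORD_SIZE, b" "))


def read_sequential(path):
    """Sequential access: read every record in order, from the first to the last."""
    result = []
    with open(path, "rb") as f:
        while chunk := f.read(RECORD_SIZE):
            result.append(chunk.rstrip().decode())
    return result


def read_direct(path, record_number):
    """Direct access: jump straight to one record using its number as the key."""
    with open(path, "rb") as f:
        f.seek(record_number * RECORD_SIZE)
        return f.read(RECORD_SIZE).rstrip().decode()


path = os.path.join(tempfile.mkdtemp(), "records.dat")
write_records(path, ["alpha", "beta", "gamma"])
sequential = read_sequential(path)
direct = read_direct(path, 2)
```

`read_direct` reaches the third record without reading the first two, which is what a
tape drive cannot do.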
Directories
Directories are used in Operating Systems in order to keep track of files.
Operations that can be performed on a directory are
Search
A directory contains many files, so the operating system should be able to search a
directory for files that match specified criteria, e.g. “dir” and “find”.
Create File
The operating system should be able to create and add new files in the directory,
e.g. “edit” and “vi”.
Delete File
The operating system should be able to remove files that are no longer needed from the
directory, e.g. “del”, “erase” and “rm”.
List Directory
The operating system should be able to provide the list of files and directories
present in a directory, e.g. “dir” and “ls”.
Backup
The operating system should provide a backup facility so that important files in a
directory can be copied to some other device, e.g. tape. If an important file is then
lost for some reason, a copy of it still exists on the other device.
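The operations above can be modelled with a small in-memory Python sketch. The class
and method names are assumptions made for this example, and a dictionary stands in for
the real on-disk directory; wildcard matching is used as one possible search criterion.

```python
import fnmatch


class Directory:
    """In-memory stand-in for a directory: maps file names to contents."""

    def __init__(self):
        self.files = {}

    def create(self, name, data=b""):
        if name in self.files:
            raise FileExistsError(name)
        self.files[name] = data

    def delete(self, name):
        del self.files[name]

    def search(self, pattern):
        """Return the names matching a wildcard pattern such as '*.txt'."""
        return sorted(n for n in self.files if fnmatch.fnmatchcase(n, pattern))

    def list_files(self):
        return sorted(self.files)

    def backup(self, target):
        """Copy every file into another Directory (standing in for a tape)."""
        target.files.update(self.files)


work = Directory()
for name in ("a.txt", "b.txt", "c.py"):
    work.create(name)
tape = Directory()
work.backup(tape)
work.delete("b.txt")
```

After the deletion, `b.txt` is gone from `work` but still present in `tape`.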
Single-Level Directory
The single-level directory is the simplest directory structure. In a single-level
directory, all the files are present in the same directory. This structure has
limitations: as all files are in one directory, each file must have a unique name.
Another problem is that the files become difficult to manage as their number
increases.
Figure: Single-Level Directory
Two-Level Directory
The real problem in the single-level directory is the confusion between the file
names of different users, as all the users store their files in the same directory.
The solution to this problem is to have a separate directory for each user. On large
systems, directory organization is logical rather than physical.
In the two-level directory structure, each user has his own directory in which he
creates and stores his files. A user's own directory is called the User File Directory
or UFD. When the user logs in, the system's Master File Directory (MFD) is searched.
The MFD is indexed by user account number, and each entry points to a UFD.
When a user tries to find a file, only that user's file directory is checked for the
existence of the file. In this way different users can have files with the same name,
as each user's file is created in his own directory.
To create a file for a user, the operating system searches only that user's directory
to confirm that the new file name is unique in it. Normally a user can create or
delete files only in his own directory.
A special system program is used when necessary to create or delete user
directories. This program creates or deletes the user directory and updates the
corresponding entry in the Master File Directory. Only authorized persons should be
allowed to use this program.
The two-level directory structure has advantages as well as disadvantages. The
advantage is that a user is completely independent, i.e. no other user can access his
files. The disadvantage appears when a user wishes to access the files of other users.
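The MFD/UFD arrangement described above can be sketched in Python with nested
dictionaries. The class name, method names, account numbers and file names are
hypothetical and serve only to illustrate the lookup path.

```python
class TwoLevelDirectory:
    """MFD indexed by account number; each entry is that user's UFD."""

    def __init__(self):
        self.mfd = {}  # account number -> UFD (dict: file name -> contents)

    def add_user(self, account):
        """Role of the special system program: create a UFD and its MFD entry."""
        if account in self.mfd:
            raise ValueError(f"account {account} already exists")
        self.mfd[account] = {}

    def create_file(self, account, name, data=""):
        ufd = self.mfd[account]
        if name in ufd:  # uniqueness is checked only inside this user's UFD
            raise FileExistsError(name)
        ufd[name] = data

    def find_file(self, account, name):
        """Search only the requesting user's UFD."""
        return self.mfd[account].get(name)


fs = TwoLevelDirectory()
fs.add_user(101)
fs.add_user(102)
fs.create_file(101, "notes.txt", "first user's notes")
fs.create_file(102, "notes.txt", "second user's notes")
```

Both users own a file called `notes.txt` without conflict, and neither can reach the
other's file through `find_file`, which reflects both the advantage and the
disadvantage noted above.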
Figure: Two-Level Directory
Tree-Structured Directories
A two-level directory can be viewed as a tree of height 2. The same concept is
generalized here to a tree of any height. Its benefit is that users can create their
own sub-directories and organize their files.
The file system of MS-DOS is based on tree-structured directories. There is a root
directory in the tree, and every file in the file system has a unique path name.
i) Absolute Path
ii) Relative Path
An Absolute Path begins at the root and follows a path down to the specified file,
showing the directory names on the path, e.g. “/usr/test/clear” or “\turboc\bin\tc.exe”.
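Path resolution in a tree-structured directory can be sketched in Python. The nested
dictionary below is a hypothetical tree built around the “/usr/test/clear” example;
the function names are assumptions, and a relative path is taken here to mean a path
interpreted from a current directory rather than from the root.

```python
import posixpath

# Hypothetical directory tree: dictionaries are directories, strings are files.
TREE = {"usr": {"test": {"clear": "contents of clear"}}}


def to_absolute(path, cwd="/"):
    """Turn a path into an absolute one, interpreting it from cwd if relative."""
    return posixpath.normpath(posixpath.join(cwd, path))


def resolve(tree, path, cwd="/"):
    """Walk down from the root, one directory name at a time."""
    node = tree
    for part in to_absolute(path, cwd).strip("/").split("/"):
        if part:
            node = node[part]
    return node


by_absolute = resolve(TREE, "/usr/test/clear")
by_relative = resolve(TREE, "test/clear", cwd="/usr")
```

Both calls reach the same file, since every file has exactly one absolute path name in
a tree.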
Acyclic-Graph Directories
Sharing of files and directories is not possible in tree-structured directories, but
files and directories can be shared in acyclic-graph directories.
In an acyclic graph, the shared file or directory exists in the file system in two or
more places at the same time. A shared file or directory does not mean two or more
copies. If two copies of the same file are present in different directories, then
changing the copy in one directory does not automatically change the copy in the other
directory. With a shared file there is only one actual file, so any change made to it
is visible to all the users sharing it. Likewise, a new file created or copied in a
shared directory is immediately visible to all the users sharing that directory.
Although the acyclic-graph directory structure is more flexible than a simple tree
structure, it is also more complex.
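The distinction between a shared file and a copy can be shown with a small in-memory
Python sketch. The class, directory and file names are hypothetical; two dictionary
entries that refer to the same object stand in for a file shared between two
directories.

```python
class File:
    """A file object; directories hold references to it, not its contents."""

    def __init__(self, data):
        self.data = data


home_a = {}  # first user's directory
home_b = {}  # second user's directory

report = File("version 1")
home_a["report"] = report                  # the one actual file
home_b["report"] = report                  # shared: a second path to the same file
home_b["report_copy"] = File(report.data)  # copy: an independent second file

home_a["report"].data = "version 2"        # change made through the first directory
```

The change is visible through `home_b["report"]` but not through
`home_b["report_copy"]`, which still holds the old contents.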
General-Graph Directory
A problem in the use of the acyclic-graph structure is making sure that there are no
cycles. When we allow users to create sub-directories and files in a two-level
directory, a tree-structured directory is formed, and if we continue to add new files
and directories to a tree-structured directory then the tree-structured directory retains its
File Protection
Files stored in a computer need to be protected from physical damage (reliability) as
well as from unauthorized access (protection).
Reliability
For reliability, we take backups of the files, so that in case of hardware failure
(e.g. errors in reading or writing, power failures) or software problems (e.g. bugs in
the file system software), we have a copy of the important files at some other place.
Normally protection is provided at the lower level i.e. a user who has access to read a
file can also copy and print it.
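A minimal Python sketch of such an access check follows. The permission names match
the read, write and execute rights mentioned earlier; the user names, the table layout
and the `allowed` function are assumptions made for this example, not the mechanism of
any particular operating system.

```python
from enum import Flag, auto


class Perm(Flag):
    READ = auto()
    WRITE = auto()
    EXECUTE = auto()


# Hypothetical protection information for one file: user name -> permissions.
acl = {
    "owner": Perm.READ | Perm.WRITE,
    "guest": Perm.READ,
}


def allowed(acl, user, wanted):
    """True if the user holds every permission in wanted; unknown users hold none."""
    return wanted in acl.get(user, Perm(0))


guest_can_read = allowed(acl, "guest", Perm.READ)    # reading also permits copying
guest_can_write = allowed(acl, "guest", Perm.WRITE)
```

Copying and printing need nothing beyond `Perm.READ`, which is why they cannot be
restricted separately at this level.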
Protection of directories is different from the protection of files. Directory
protection may cover the creation and deletion of files in a directory, or even
restrict the listing of files in a directory.
Similarly, protection of users/groups can differ from the protection of files or
directories, as in the case of Windows 2000.