OS Unit - 5 Notes

This document discusses file systems and file access methods. It provides definitions of key concepts like what a file is, file attributes, and common file operations. It also describes different file access methods like sequential access, direct access, and indexed sequential access. Sequential access involves accessing file records in order, direct access allows random access to any block, and indexed sequential access uses an index to perform direct block access followed by sequential record access within blocks.


R20 CMR Technical Campus

UNIT-V
File System Interface and Operations - Access methods, Directory Structure, Protection, File
System Structure, Allocation methods, Free-space Management. Usage of open, create, read,
write, close, lseek, stat, ioctl system calls.

1. INTRODUCTION TO FILE CONCEPTS


Computers can store information permanently on various storage media, such as magnetic
disks, magnetic tapes, and optical disks. The operating system abstracts the physical properties
of these storage devices into a logical storage unit. This logical storage unit is called a FILE.

A file is a named collection of related information that is recorded on secondary storage. From a
user's perspective, a file is the smallest allotment of logical secondary storage; that is, data
cannot be written to secondary storage unless they are within a file. In general, a file is a
sequence of bits, bytes, lines, or records, the meaning of which is defined by the file's creator and
user. Many different types of information may be stored in a file—source programs, object
programs, executable programs, numeric data, text, payroll records, graphic images, sound
recordings, and so on.

FILE ATTRIBUTES: A file is named, for the convenience of its human users, and is referred to by
its name. A name is usually a string of characters, such as example.c. Some systems differentiate between
uppercase and lowercase characters in names, whereas other systems do not.
A file's attributes vary from one operating system to another but typically consist of these:

• Name: A file is named for the convenience of the user and is referred to by its name. A
name is usually a string of characters.
• Identifier: This unique tag, usually a number, identifies the file within the file system.
• Type: Files are of many types. On many systems the type is indicated by the extension of
the file name. Common file types are listed in the table under FILE TYPES.
• Location: This is a pointer to the location of the file on the storage device.
• Size: The current size of the file (in bytes, words, or blocks).

Syeda Sumaiya Afreen pg. 1



• Protection: Access-control information determines who can do reading, writing,
executing, and so on.
• Time, Date, User identification: This information may be kept for creation, last
modification, and last use.
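
The attributes listed above can be inspected from a program with the stat system call. The following is a minimal sketch in Python, assuming a UNIX-like system; the helper name describe and the demo file are illustrative only and are not part of these notes.

```python
import os
import stat
import tempfile
import time

SOURCE = b"int main(void) { return 0; }\n"  # contents of the demo file

def describe(path):
    """Return a dict of common file attributes, read with stat()."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "identifier": info.st_ino,                  # inode number on UNIX-like systems
        "type": os.path.splitext(path)[1],          # extension, used only as a type hint
        "size": info.st_size,                       # current size in bytes
        "protection": stat.filemode(info.st_mode),  # permission string such as '-rw-------'
        "modified": time.ctime(info.st_mtime),      # time of last modification
    }

# Hypothetical demo file, created in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "example.c")
    with open(demo_path, "wb") as f:
        f.write(SOURCE)
    attrs = describe(demo_path)
    for key, value in attrs.items():
        print(f"{key:12} {value}")
```

The location attribute is not shown because the stat interface does not expose the disk addresses of a file; that information stays inside the file system.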

FILE OPERATIONS

• Creating a file: The OS first checks whether free space is available. If there is no free
space, the file cannot be created. If space is available, the file is created and an
entry for the new file is added to the directory. The entry includes file attributes such as file
name, file location, and size.
• Writing a file: The OS searches the directory for the given file. If the file is not found, a new
file is created with the given name. If the file is found, the existing file is opened. The system
keeps a write pointer to the location in the file where the next write is to take place. After each
write operation, the write pointer is updated.
• Reading a file: The OS searches the directory for the given file. If the file is found, the
system keeps a read pointer to the location in the file where the next read is to take place.
After each read operation, the read pointer is updated.
• Repositioning within a file: This operation is also called file seek. The current file position
pointer is changed to a given value.
• Deleting a file: To delete a file, first search the directory for the given file name, then release
the file space and erase the directory entry.
• Truncating a file: The user may want to erase the contents of a file but keep its attributes.
Truncating a file erases its entire contents and resets its length to zero, but the file itself and
its other attributes remain.
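
The operations above map closely onto the open, write, lseek, read and close system calls named in the syllabus. The sketch below uses Python's os module, which wraps those calls; the file name notes.txt and the data written are assumptions made for illustration.

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "notes.txt")   # hypothetical file name

# Create: O_CREAT adds a directory entry if the file does not exist yet.
fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)

# Write: each write advances the file position pointer.
os.write(fd, b"hello world")

# Reposition (seek): move the pointer to byte offset 6.
os.lseek(fd, 6, os.SEEK_SET)

# Read: reads from the current position and advances the pointer.
tail = os.read(fd, 5)

# Truncate: erase the contents but keep the file and its attributes.
os.ftruncate(fd, 0)
size_after_truncate = os.fstat(fd).st_size
os.close(fd)

# Delete: remove the directory entry and release the file space.
os.unlink(path)
exists_after_delete = os.path.exists(path)
os.rmdir(tmp_dir)

print(tail, size_after_truncate, exists_after_delete)
```

Note that a single position pointer serves both reads and writes here, which is how most systems implement the read and write pointers described above.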


FILE TYPES:

The name of a file consists of two parts: a name and an extension. The file type depends on
the extension, which indicates what type of file it is and which application software is used to
open or run it.

File Type        Extension            Purpose

Executable       .exe, .com, .bin     Ready-to-run machine-language program
Source code      .c, .cpp, .asm       Source code in various languages
Object           .obj, .o             Compiled, machine language, not linked
Batch            .bat, .sh            Commands to the command interpreter
Text             .txt, .doc           Textual data, documents
Word processor   .doc, .wp, .rtf      Various word-processor formats
Library          .lib, .dll           Library routines for programmers
Print or View    .gif, .pdf, .jpg     ASCII or binary file in a format for printing or viewing
Archive          .arc, .zip           Related files grouped into one file, sometimes compressed for archiving or storage
Multimedia       .mpeg, .mp3, .avi    Binary file containing audio or A/V information

Figure: Common file types


2. FILE ACCESS METHODS

Files store information on secondary storage devices. When it is used, this information must be
accessed and read into main memory. The information in a file can be accessed in several
ways.
i. Sequential Access:

Information in the file is accessed in serial order, i.e. one record after the other. Editors and
compilers usually access files in this fashion, and magnetic tapes support this type of file
access. Reads and writes make up the bulk of the operations on a file. A read
operation (read next) reads the next portion of the file and automatically advances a file
pointer, which tracks the I/O location. Similarly, the write operation (write next) appends to
the end of the file and advances to the end of the newly written material (the new end of file).

Fig 2.1: Sequential-access file

For example, consider a file consisting of 100 records. Let the current position of the read/write
head be at the 45th record. If we next want to read the 75th record, the file is accessed
sequentially: 45, 46, 47, ..., 74, 75. Even though only the 75th record is needed after the 45th,
the read/write head traverses all the records from 45 to 75. So, it is a time-consuming method.

ii. Direct Access:

Direct access is also called relative access. Here records can be read or written randomly, in no
particular order. The direct access method is based on a disk model of a file, because disks allow
random access to any file block. For example, consider a disk consisting of 512 blocks. Let the
position of the read/write head be at the 124th block. If the next block to be read or written is the
256th block, we can jump from the 124th block to the 256th block directly without any restrictions.
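
The difference between the two methods can be sketched in a few lines of Python. This is an illustration under assumptions: records have a fixed length of 8 bytes, the file lives in memory, and the class name RecordFile and its methods are hypothetical. Sequential access is simulated on top of direct access by keeping a current-position counter cp.

```python
import io

RECORD_SIZE = 8  # assumed fixed record length in bytes

class RecordFile:
    """Direct-access file of fixed-size records, with sequential access on top."""

    def __init__(self, stream):
        self.stream = stream
        self.cp = 0  # current position, counted in records

    def read_direct(self, n):
        # Direct access: jump straight to record n.
        self.stream.seek(n * RECORD_SIZE)
        return self.stream.read(RECORD_SIZE)

    def read_next(self):
        # Sequential access: read record cp, then advance cp.
        record = self.read_direct(self.cp)
        self.cp += 1
        return record

# A file of 100 records named rec00000 ... rec00099.
data = b"".join(f"rec{n:05d}".encode() for n in range(100))
f = RecordFile(io.BytesIO(data))

direct = f.read_direct(75)   # one access

f.cp = 45                    # sequential: start at record 45
reads = 0
while f.cp <= 75:
    last = f.read_next()
    reads += 1

print(direct, last, reads)
```

Reaching record 75 from record 45 takes one operation with direct access but 31 read operations (records 45 through 75) with sequential access.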


Fig 2.2 : Simulation of sequential access on a direct-access file.

As another example, suppose a CD contains 10 songs and we are currently listening to song 3. If
we want to listen to song 7, we can jump directly to song 7. (With a tape cassette this is not possible.)

iii. Indexed Sequential File Access

This access method is a combination of sequential access and direct access. The
file contains a set of records, and a group of records forms a block. The main concept is to perform
direct access on the blocks and then sequential access on the records within a block. This access
method involves maintaining an index, similar to the index of a textbook. Each index entry
contains two fields: a key field and a pointer field. To access data from a file, direct access is
performed to locate the block using the pointer field, and within the block the record is accessed
sequentially. Sometimes the index itself becomes large. In that case a hierarchy of indexes is built,
in which one direct access of an index leads to another index, and so on, until the actual
file block is reached and searched sequentially for the particular record.
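
A minimal sketch of this two-step lookup is given below. It assumes that each index entry holds the smallest key of one block, that the index is sorted, and that the index is searched with a binary search; the sample records and the function name lookup are invented for illustration.

```python
from bisect import bisect_right

# Hypothetical file: each block holds a few records sorted by key.
blocks = [
    [(3, "ann"), (7, "bob"), (9, "cat")],
    [(12, "dan"), (15, "eve"), (18, "fay")],
    [(21, "gus"), (26, "hal"), (30, "ivy")],
]

# Index: one (key, pointer) pair per block; the key is the first key in the block.
index_keys = [block[0][0] for block in blocks]   # [3, 12, 21]
index_ptrs = list(range(len(blocks)))            # block numbers

def lookup(key):
    """Direct access to the block via the index, then sequential scan inside it."""
    pos = bisect_right(index_keys, key) - 1      # last index entry with key <= target
    if pos < 0:
        return None                              # smaller than every key in the file
    block = blocks[index_ptrs[pos]]              # direct access using the pointer field
    for record_key, value in block:              # sequential access within the block
        if record_key == key:
            return value
    return None

print(lookup(15), lookup(21), lookup(16))
```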


Fig 2.3 : Example of index and relative files

3. DIRECTORY STRUCTURE

Since a computer system may contain thousands to millions of files, managing them is very hard.
To manage these files, the directory concept was introduced. Files are grouped, and each group is
placed into a container called a directory. In Windows, directories are also called
folders. A directory structure provides a mechanism for organizing many files in the file system.
A directory can contain multiple files, and it can even have other directories inside it. The
directory contains information about file attributes such as location, ownership, and size.

When considering a particular directory structure, we need to keep in mind the operations
that are to be performed on a directory:

• Search for a file. We need to be able to search a directory structure to find the entry
for a particular file. Since files have symbolic names and similar names may indicate
a relationship between files, we may want to be able to find all files whose names
match a particular pattern.

• Create a file. New files need to be created and added to the directory.

• Delete a file. When a file is no longer needed, we want to be able to remove it from
the directory.


• List a directory. We need to be able to list the files in a directory and the contents of
the directory entry for each file in the list.

• Rename a file. Because the name of a file represents its contents to its users, we
must be able to change the name when the contents or use of the file changes.
Renaming a file may also allow its position within the directory structure to be
changed.

• Traverse the file system. We may wish to access every directory and every file
within a directory structure. For reliability, it is a good idea to save the contents and
structure of the entire file system at regular intervals. Often, we do this by copying
all files to magnetic tape. This technique provides a backup copy in case of system
failure. In addition, if a file is no longer in use, the file can be copied to tape and the
disk space of that file released for reuse by another file.
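
The directory operations listed above can be tried out with Python's standard library, as in the sketch below. The file names are invented, and the calls shown (os.listdir, fnmatch.filter, os.rename, os.remove, os.scandir, os.walk) are one possible way to express each operation rather than the only one.

```python
import fnmatch
import os
import shutil
import tempfile

root = tempfile.mkdtemp()

# Create: add new (empty) files to the directory.
for name in ("a.c", "b.c", "notes.txt"):
    open(os.path.join(root, name), "w").close()

# Search: find all files whose names match a pattern.
c_files = sorted(fnmatch.filter(os.listdir(root), "*.c"))

# Rename: change the name of an existing entry.
os.rename(os.path.join(root, "notes.txt"), os.path.join(root, "readme.txt"))

# Delete: remove an entry from the directory.
os.remove(os.path.join(root, "b.c"))

# List: show each entry together with one of its attributes (size).
listing = sorted((entry.name, entry.stat().st_size) for entry in os.scandir(root))

# Traverse: visit every directory and every file below the root.
os.mkdir(os.path.join(root, "sub"))
open(os.path.join(root, "sub", "c.c"), "w").close()
all_files = sorted(
    os.path.relpath(os.path.join(folder, name), root)
    for folder, _, names in os.walk(root)
    for name in names
)

print(c_files, listing, all_files)
shutil.rmtree(root)  # clean up the temporary directory
```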

The directory structures supported by OS are:

i. Single level directory:

This directory system contains only one directory, called the root directory. All files are stored
in this directory. When the number of files increases or when the system has more than one
user, a single-level directory is not useful. Since all the files are in the same directory, they must
have unique names. For example, if user-1 creates a file called sample and user-2 later
also creates a file called sample, the names collide and user-2's file will overwrite user-1's file.

Fig 3.1 : Single-level directory

ii. Two level directory:

In the two-level directory structure, the root directory is called the master file directory (MFD).
Each user has their own user file directory (UFD). The MFD is indexed by user name, and each
entry points to the UFD for that user. The files of a particular user are stored in that user's UFD.
In the diagram below, the root directory is the MFD; user1, user2, user3 and user4 are the
user-level directories; and f1, f2, …, f8 are files. Different users can have files with the same
name, and searching is efficient because only one user's UFD has to be searched.

Fig 3.2 : Two level directory

iii. Tree structured directory:

A two-level directory eliminates name conflicts among users, but it is not satisfactory for users
with a large number (hundreds to millions) of files. To overcome this, each user can create
sub-directories and place files of the same type into the same sub-directory. A sub-directory can
in turn have another sub-directory, and so on. This can be viewed as a tree-like structure. Here
each user can have as many directories as needed. The user can change the current directory
whenever desired. If a needed file is not in the current directory, the user must either specify a
path name or change the current directory. Paths can be of two types:

a) Absolute Path: It begins at the root and follows a path down to the specified file.
Ex: \Programs\e\hex

b) Relative Path: It defines a path from the current directory.
Ex: \find\p if programs is the current directory.
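
How a path name is resolved can be sketched with Python's pathlib. The sketch assumes POSIX-style paths with "/" as the separator (not the backslash used in the examples above), and the directory names and the helper resolve are hypothetical.

```python
from pathlib import PurePosixPath

def resolve(path, current_dir):
    """Return the absolute path that a path name refers to."""
    p = PurePosixPath(path)
    if p.is_absolute():
        # Absolute path: starts at the root, so the current directory is ignored.
        return p
    # Relative path: interpreted starting from the current directory.
    return PurePosixPath(current_dir) / p

absolute = resolve("/programs/e/hex", "/spell")
relative = resolve("e/hex", "/programs")
print(absolute, relative)
```

Both calls name the same file: the first gives the whole path from the root, and the second gives only the part below the current directory /programs.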


Fig 3.3 : Tree-structured directory structure.

iv. Acyclic graph directory

When multiple users are working on the same project, the project files can be stored in a
common sub-directory and shared among those users. This type of
directory is called an acyclic graph directory. The common directory is declared as a shared
directory. The graph contains no cycles. With shared files, changes made by one user are
visible to the other users. A file may now have multiple absolute paths. When a shared directory or
shared file is deleted, all pointers to the directory or file must also be removed. In the diagram
below, user1 and user2 share the same directory, called count.
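
On UNIX-like systems, one common way to share a file between two directories is a hard link, and the file system keeps a link count so that it knows when no references remain. The sketch below assumes a file system that supports hard links; it shares a file named count rather than a directory, because most systems do not allow hard links to directories. The directory and file names are illustrative.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()
user1 = os.path.join(base, "user1")
user2 = os.path.join(base, "user2")
os.mkdir(user1)
os.mkdir(user2)

# user1 creates the file.
original = os.path.join(user1, "count")
with open(original, "w") as f:
    f.write("v1")

# user2 gets a second directory entry (link) for the same file.
shared = os.path.join(user2, "count")
os.link(original, shared)
links_while_shared = os.stat(original).st_nlink

# A change made through one path is visible through the other.
with open(shared, "a") as f:
    f.write("+v2")
with open(original) as f:
    seen_by_user1 = f.read()

# Deleting one entry only lowers the link count; the data stays
# until the last reference is removed.
os.remove(original)
links_after_delete = os.stat(shared).st_nlink
with open(shared) as f:
    still_readable = f.read()

print(links_while_shared, seen_by_user1, links_after_delete, still_readable)
shutil.rmtree(base)
```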

Fig 3.4 : Acyclic-graph directory structure



v. General graph directory:

When we add links to an existing tree-structured directory, the tree structure is destroyed,
resulting in a simple graph structure. In a general graph directory, cycles are allowed, and a
directory can have more than one parent directory. The advantage of this type of directory is
that sharing is flexible. The drawback is that a traversal must take care not to follow a cycle
endlessly. By contrast, the primary advantage of an acyclic graph is the relative simplicity of the
algorithms to traverse the graph and to determine when there are no more references to a file.
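
One standard way to traverse a directory graph that may contain cycles is to remember which directories have already been visited. The sketch below models directories as a small dictionary; the directory names and the link from docs back to root are made up to create a cycle.

```python
# Hypothetical directory graph: each directory maps to the directories it contains.
graph = {
    "root": ["user1", "user2"],
    "user1": ["docs"],
    "user2": ["docs"],      # docs is shared by user1 and user2
    "docs": ["root"],       # a link back to root creates a cycle
}

def traverse(start):
    """Depth-first traversal that visits each directory exactly once."""
    visited = []
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue                  # already visited: do not follow the cycle again
        seen.add(node)
        visited.append(node)
        # Push children in reverse so they are visited in listed order.
        stack.extend(reversed(graph.get(node, [])))
    return visited

print(traverse("root"))
```

Without the seen set, the traversal would loop forever through root, user1, docs and back to root.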

Fig 3.5 : General graph directory

4. PROTECTION

When information is stored in a computer system, we want to keep it safe from physical
damage (reliability) and improper access (protection). Reliability is generally provided by duplicate
copies of files. Many computers have systems programs that automatically copy disk files to tape at
regular intervals (once per day or week or month) to maintain a copy should a file system be
accidentally destroyed. File systems can be damaged by hardware problems (such as errors in
reading or writing), power surges or failures, head crashes, dirt, temperature extremes, and
vandalism. Files may be deleted accidentally. Bugs in the file-system software can also cause file
contents to be lost. Protection can be provided in many ways. For a small single-user system, we
might provide protection by physically removing the floppy disks and locking them in a desk drawer
or file cabinet. In a multiuser system, however, other mechanisms are needed.


Protection mechanisms provide controlled access by limiting the types of file access that can
be made. Access is permitted or denied depending on several factors, such as the user type and
the type of access requested.

The most common approach to the protection problem is to make access dependent on the
identity of the user. Different users need different types of access to a file. Each file is associated
with an access control list (ACL) specifying user names and the types of access allowed for each
user. When a user requests access to a file, the OS checks the ACL associated with
that file. If that user is listed for the requested access, the access is allowed. Otherwise, a protection
violation occurs, and the user process is denied access to the file.

Access can be provided to the following classes of users:

• Owner: The user who created the file is the owner.
• Group: A set of users who are sharing the file.
• Universe: All the other users in the system constitute the universe.

The access types of a file can be:

• Read ( r ) : Read from the file.
• Write ( w ) : Write or rewrite the file.
• Execute ( x ) : Load the file into memory and execute it.
• No access ( - ) : Not allowed to access.
• Append : Write new information at the end of the file.
• Delete : Delete the file and free its space for possible reuse.
• List : List the name and attributes of the file.

The general format of file access is given below:

Owner  Group  Universe
rwx    rwx    rwx

Example: file_name rwx rw- r--

On the given file_name, the owner can read, write and execute; the group can read and
write; and the universe (all other users) can only read.
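
The nine-character permission format above can be decoded mechanically. The following sketch uses hypothetical helper names (parse_permissions, to_octal, allowed) and assumes the string always has exactly nine characters in owner, group, universe order.

```python
CLASSES = ("owner", "group", "universe")

def parse_permissions(text):
    """Split a string such as 'rwxrw-r--' into one set of access types per class."""
    if len(text) != 9:
        raise ValueError("expected 9 permission characters")
    return {
        cls: {ch for ch in text[i * 3:(i + 1) * 3] if ch != "-"}
        for i, cls in enumerate(CLASSES)
    }

def to_octal(text):
    """Convert the same string to the octal form used by chmod, e.g. '0o764'."""
    bits = "".join("0" if ch == "-" else "1" for ch in text)
    return oct(int(bits, 2))

def allowed(text, user_class, access):
    """Return True if the given class of user may perform the access type."""
    return access in parse_permissions(text)[user_class]

mode = "rwxrw-r--"
print(to_octal(mode))                    # 0o764
print(allowed(mode, "group", "w"))       # True
print(allowed(mode, "universe", "w"))    # False
```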


Other Protection approaches: Another approach to the protection problem is to associate a
password with each file. Just as access to the computer system is often
controlled by a password, access to each file can be controlled in the same way.

Disadvantages

• The number of passwords that a user needs to remember may become large if different
passwords are set for different files.
• If only one password is used for all files, then once it is discovered, all files are
accessible.

5. FILE SYSTEM STRUCTURE


Disks provide the bulk of secondary storage on which a file system is maintained. They
have two characteristics that make them a convenient medium for storing multiple files: 1) A
disk can be rewritten in place; it is possible to read a block from the disk, modify the block, and
write it back into the same place. 2) A disk can access directly any given block of information it
contains. Rather than transferring a byte at a time, to improve I/O efficiency, I/O transfers
between memory and disk are performed in units of blocks. Each block has one or more sectors.

To provide efficient and convenient access to the disk, the operating system imposes one or
more file systems to allow the data to be stored, located, and retrieved easily. Most
operating systems use a layered approach for many tasks, including file systems. The file
system itself is generally composed of many different levels, and every layer is responsible
for some activities. Each level in the design uses the features of lower levels to create new
features for use by higher levels. The figure below shows how the file system is divided into
layers and the functionality of each layer.


Fig 5.1 : Layered file system.

• When an application program asks for a file, the request is first directed to the logical file
system. The logical file system manages the metadata of the file and the directory structure.
It maintains file structure via file control blocks. A file control block (an inode in UNIX file
systems) contains information about the file, such as ownership, permissions, and the location
of the file contents. If the application program does not have the required permissions on the
file, this layer reports an error. The logical file system also verifies the path to the file.

• Files are stored on and retrieved from the hard disk. The hard disk is divided into
tracks, and each track is divided into sectors; a disk block consists of one or more sectors.
The file content is divided into logical blocks, and each logical block is stored in a physical
disk block. Therefore, in order to store and retrieve files, logical blocks
need to be mapped to physical blocks. This mapping is done by the file organization module,
which is also responsible for free-space management.


• Once the file organization module has decided which physical block the application program
needs, it passes this information to the basic file system. The basic file system is responsible
for issuing commands to I/O control in order to fetch those blocks.
• I/O control contains the code used to access the hard disk. This code is
known as device drivers. I/O control is also responsible for handling interrupts to
transfer information between main memory and the disk system.

6. FILE SYSTEM IMPLEMENTATION


Several on-disk and in-memory structures are used to implement a file system. These structures
vary depending on the operating system and the file system.
1. On-disk Structures –
Generally, they contain information about the total number of disk blocks, the free disk blocks,
their locations, and so on. The different on-disk structures are given below:

i. Boot Control Block –
It is usually the first block of a volume, and it contains the information needed to boot an operating
system. In UNIX it is called the boot block, and in NTFS it is called the partition boot sector.
ii. Volume Control Block –
It has information about a particular partition, e.g. free block count, block size and free block
pointers. In UNIX it is called the superblock, and in NTFS it is stored in the master file table.
iii. Directory Structure –
It is used to organize the files. In UNIX it includes file names and the associated inode numbers,
and in NTFS it is stored in the master file table.
iv. Per-File FCB –
It contains details about the file and has a unique identifier number to allow association with
a directory entry. In NTFS it is stored in the master file table.


Fig 6.1 : A typical file-control block.

2. In-Memory Structures –
The in-memory information is used for both file-system management and performance
improvement via caching. Several in-memory structures are given below:

i. Mount Table – It contains information about each mounted volume.
ii. Directory-Structure Cache – This cache holds the directory information of recently accessed
directories.
iii. System-wide open-file table – It contains a copy of the FCB of each open file.
iv. Per-process open-file table –
It contains information about the files opened by that particular process, and each entry points to
the appropriate entry in the system-wide open-file table.

Figure 6.2 illustrates some of the interactions of file system components when files are created
and/or used:

• When a new file is created, a new FCB is allocated and filled in with the important information
about the new file. The appropriate directory is updated with the new file name and FCB
information.
• When a program accesses a file, the open( ) system call reads the FCB information from
disk and stores it in the system-wide open-file table. An entry is added to the per-process open-file
table referencing the system-wide table, and an index into the per-process table is returned by the
open( ) system call. UNIX refers to this index as a file descriptor, and Windows refers to it as a file
handle.
• If another process already has the file open when a new request comes in for the same file, and the
file is sharable, then a counter in the system-wide table is incremented and the per-process table is
adjusted to point to the existing entry in the system-wide table.
• When a file is closed, the per-process table entry is freed, and the counter in the system-wide table is
decremented. If that counter reaches zero, then the system-wide table entry is also freed. Any data
currently stored in the memory cache for this file is written out to disk if necessary.
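
The interaction between the two open-file tables can be modelled in a few lines. This is a simplified sketch, not how any particular kernel stores these tables: the class OpenFileTables, the use of the file name as the key, and the string standing in for the FCB are all assumptions made for illustration.

```python
class OpenFileTables:
    """Toy model of the system-wide and per-process open-file tables."""

    def __init__(self):
        self.system_wide = {}   # file name -> {"fcb": ..., "count": open count}
        self.per_process = {}   # pid -> list of file names; list index = descriptor

    def open(self, pid, name):
        entry = self.system_wide.get(name)
        if entry is None:
            # First open: "read the FCB from disk" into the system-wide table.
            entry = {"fcb": f"FCB({name})", "count": 0}
            self.system_wide[name] = entry
        entry["count"] += 1
        table = self.per_process.setdefault(pid, [])
        table.append(name)
        return len(table) - 1   # index into the per-process table (file descriptor)

    def close(self, pid, fd):
        name = self.per_process[pid][fd]
        if name is None:
            raise ValueError("descriptor already closed")
        self.per_process[pid][fd] = None      # free the per-process entry
        entry = self.system_wide[name]
        entry["count"] -= 1
        if entry["count"] == 0:
            del self.system_wide[name]        # last close frees the system-wide entry

tables = OpenFileTables()
fd_a = tables.open(pid=1, name="a.txt")
fd_b = tables.open(pid=2, name="a.txt")       # same file, second process
count_while_shared = tables.system_wide["a.txt"]["count"]
tables.close(1, fd_a)
count_after_one_close = tables.system_wide["a.txt"]["count"]
tables.close(2, fd_b)
print(count_while_shared, count_after_one_close, "a.txt" in tables.system_wide)
```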

Fig 6.2 : In-memory file-system structures. (a) File open. (b) File read.

Partitions and Mounting

• Physical disks are commonly divided into smaller units called partitions. They can also be combined into
larger units, but that is most commonly done for RAID installations and is left for later chapters.
• Partitions can either be used as raw devices (with no structure imposed upon them), or they can be formatted
to hold a filesystem (i.e. populated with FCBs and initial directory structures as appropriate). Raw partitions
are generally used for swap space, and may also be used for certain programs such as databases that choose to
manage their own disk storage system. Partitions containing filesystems can generally only be accessed using
the file system structure by ordinary users, but can often be accessed as a raw device also by root.
• The boot block is accessed as part of a raw partition, by the boot program prior to any operating system being
loaded. Modern boot programs understand multiple OSes and filesystem formats, and can give the user a
choice of which of several available systems to boot.


• The root partition contains the OS kernel and at least the key portions of the OS needed to complete the boot
process. At boot time the root partition is mounted, and control is transferred from the boot program to the
kernel found there. (Older systems required that the root partition lie completely within the first 1024
cylinders of the disk, because that was as far as the boot program could reach. Once the kernel had control,
then it could access partitions beyond the 1024 cylinder boundary.)
• Continuing with the boot process, additional filesystems get mounted, adding their information into the
appropriate mount table structure. As a part of the mounting process the file systems may be checked for
errors or inconsistencies, either because they are flagged as not having been closed properly the last time they
were used, or just on general principle. Filesystems may be mounted either automatically or manually. In
UNIX a mount point is indicated by setting a flag in the in-memory copy of the inode, so all future references
to that inode get re-directed to the root directory of the mounted filesystem.

Virtual File Systems

• A Virtual File System, VFS, provides a common interface to multiple different filesystem types. In addition, it
provides a unique identifier (vnode) for files across the entire space, including across all filesystems of
different types. (UNIX inodes are unique only across a single filesystem, and certainly do not carry across
networked file systems.)
• The VFS in Linux is based upon four key object types:
o The inode object, representing an individual file.
o The file object, representing an open file.
o The superblock object, representing a filesystem.
o The dentry object, representing a directory entry.
• Linux VFS provides a set of common functionalities for each filesystem, using function pointers accessed
through a table. The same functionality is accessed through the same table position for all filesystem types,
though the actual functions pointed to by the pointers may be filesystem-specific. See /usr/include/linux/fs.h
for full details. Common operations provided include open( ), read( ), write( ), and mmap( ).
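
The idea of reaching filesystem-specific functions through a common table can be illustrated with a small dispatch table. This is only a conceptual sketch in Python, not the Linux VFS interface: the filesystem types, mount points and function names below are invented.

```python
# Two hypothetical filesystem types, each with its own read function.
def ext_read(path):
    return f"ext read of {path}"

def nfs_read(path):
    return f"nfs read of {path}"

# One operations table per filesystem type; the same key ("read")
# reaches a different function depending on the type.
OPERATIONS = {
    "ext": {"read": ext_read},
    "nfs": {"read": nfs_read},
}

# Mount table: mount point -> filesystem type (illustrative values).
MOUNTS = {"/": "ext", "/net": "nfs"}

def vfs_read(path):
    """Pick the filesystem mounted at the longest matching prefix and dispatch."""
    candidates = [
        m for m in MOUNTS
        if path == m or path.startswith(m.rstrip("/") + "/")
    ]
    mount = max(candidates, key=len)
    return OPERATIONS[MOUNTS[mount]]["read"](path)

print(vfs_read("/home/a.txt"))
print(vfs_read("/net/data.bin"))
```

The caller uses vfs_read for every file and never needs to know which filesystem type actually serves the request.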


Fig 6.3 : Schematic view of a virtual file system.

Directory Implementation

Directories need to be fast to search, insert, and delete, with a minimum of wasted disk space.

Linear List

• A linear list is the simplest and easiest directory structure to set up, but it does have some drawbacks.
• Finding a file (or verifying one does not already exist upon creation) requires a linear search.
• Deletions can be done by moving all entries, flagging an entry as deleted, or by moving the last entry into
the newly vacant position.
• Sorting the list makes searches faster, at the expense of more complex insertions and deletions.
• A linked list makes insertions and deletions into a sorted list easier, with overhead for the links.
• More complex data structures, such as B-trees, could also be considered.

Hash Table

• A hash table can also be used to speed up searches.
• Hash tables are generally implemented in addition to a linear or other structure.
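
The two structures can be combined as in the sketch below: a linear list holds the directory entries, and a hash table (a Python dict) maps each name to its position in the list. The class name Directory, the FCB numbers and the file names are hypothetical. Deletion uses the "move the last entry into the vacant position" technique from the list above.

```python
class Directory:
    """Linear list of entries plus a hash table used to speed up lookups."""

    def __init__(self):
        self.entries = []   # linear list of (name, fcb_number)
        self.index = {}     # hash table: name -> position in entries

    def linear_search(self, name):
        # Without the hash table, every lookup scans the list.
        for pos, (entry_name, _) in enumerate(self.entries):
            if entry_name == name:
                return pos
        return -1

    def insert(self, name, fcb_number):
        if name in self.index:
            raise FileExistsError(name)
        self.index[name] = len(self.entries)
        self.entries.append((name, fcb_number))

    def lookup(self, name):
        pos = self.index.get(name)
        return None if pos is None else self.entries[pos][1]

    def delete(self, name):
        pos = self.index.pop(name)
        last = self.entries.pop()
        if pos < len(self.entries):
            # Move the last entry into the newly vacant position.
            self.entries[pos] = last
            self.index[last[0]] = pos

d = Directory()
for fcb, name in enumerate(["a.c", "b.c", "c.c"], start=1):
    d.insert(name, fcb)
d.delete("a.c")
print(d.entries, d.lookup("c.c"), d.linear_search("c.c"))
```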

7. FILE ALLOCATION METHODS



An allocation method refers to how disk blocks are allocated for files. Three major methods of
allocating disk space are in wide use: contiguous, linked, and indexed. The aim is to achieve
efficient disk utilization and fast access time.

i. Contiguous allocation:

• Contiguous allocation requires that all blocks of a file be kept together contiguously.
• Disk addresses define a linear ordering on the disk. The directory entry for each file indicates the address of
the starting block and the length of the area allocated for this file, as shown in the figure below.

Fig 7.1 : Contiguous allocation of disk space

• Performance is very fast, because reading successive blocks of the same file generally requires no
movement of the disk heads, or at most one small step to the next adjacent cylinder.

• Accessing a file that has been allocated contiguously is easy. For sequential access, the
file system remembers the disk address of the last block referenced and, when
necessary, reads the next block. For direct access to block i of a file that starts at block
b, we can immediately access block b + i. Thus, both sequential and direct access can
be supported by contiguous allocation.

• Storage allocation involves the same issues discussed earlier for the allocation of contiguous blocks
of memory (first fit, best fit, fragmentation problems, etc.). The distinction is that the high time
penalty required for moving the disk heads from spot to spot may now justify the benefits of keeping
files contiguous when possible.
• (Even file systems that do not by default store files contiguously can benefit from certain utilities
that compact the disk and make all files contiguous in the process.)
• Problems can arise when files grow, or if the exact size of a file is unknown at creation time:
o Over-estimation of the file's final size increases external fragmentation and wastes disk space.
o Under-estimation may require that a file be moved or a process aborted if the file grows
beyond its originally allocated space.
o If a file grows slowly over a long time period and the total final space must be allocated
initially, then a lot of space becomes unusable before the file fills the space.
• A variation is to allocate file space in large contiguous chunks, called extents. When a file outgrows
its original extent, then an additional one is allocated. (For example, an extent may be the size of a
complete track or even cylinder, aligned on an appropriate track or cylinder boundary.) The
high-performance file system Veritas uses extents to optimize performance.
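
Contiguous allocation can be sketched with a free-block list and a directory that stores (start, length) for each file. The sketch below assumes a tiny disk of 32 blocks and a first-fit search; the function names and file names are invented for illustration.

```python
DISK_BLOCKS = 32                    # assumed size of a very small disk
free = [True] * DISK_BLOCKS         # True means the block is free
directory = {}                      # file name -> (start block, length)

def allocate(name, length):
    """First fit: use the first run of `length` consecutive free blocks."""
    if length < 1:
        raise ValueError("length must be at least 1")
    run = 0
    for block in range(DISK_BLOCKS):
        run = run + 1 if free[block] else 0
        if run == length:
            start = block - length + 1
            for b in range(start, block + 1):
                free[b] = False
            directory[name] = (start, length)
            return start
    raise OSError("no contiguous hole large enough")

def release(name):
    start, length = directory.pop(name)
    for b in range(start, start + length):
        free[b] = True

def physical_block(name, i):
    """Direct access: logical block i of a file starting at b is block b + i."""
    start, length = directory[name]
    if not 0 <= i < length:
        raise IndexError("block outside the file")
    return start + i

allocate("a", 5)     # blocks 0-4
allocate("b", 3)     # blocks 5-7
allocate("c", 4)     # blocks 8-11
release("b")         # leaves a 3-block hole at 5-7
allocate("d", 4)     # does not fit in the hole, so it goes after c
print(directory, physical_block("c", 2))
```

The 3-block hole left by file b stays unused until a file of at most 3 blocks arrives, which is the external fragmentation problem described above.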

ii. Linked allocation

• Disk files can be stored as linked lists, at the expense of the storage space consumed by each link.
(E.g. a block may hold 508 bytes of data instead of 512.)
• Linked allocation involves no external fragmentation, does not require pre-known file sizes, and
allows files to grow dynamically at any time.

• To create a new file, we simply create a new entry in the directory. With linked
allocation, each directory entry has a pointer to the first disk block of the file. This
pointer is initialized to nil (the end-of-list pointer value) to signify an empty file. The
size field is also set to 0. A write to the file causes the free-space management system
to find a free block, and this new block is written to and is linked to the end of the file.
To read a file, we simply read blocks by following the pointers from block to block.


 Unfortunately linked allocation is only efficient for sequential access files, as random access requires
starting at the beginning of the list for each new location access.
 Another disadvantage is the space required for the pointers. If a pointer requires 4 bytes out of a 512-
byte block, then 0.78 percent of the disk is being used for pointers, rather than for information. Each
file requires slightly more space than it would otherwise need.
 Allocating clusters of blocks reduces the space wasted by pointers, at the cost of internal
fragmentation.
 Another big problem with linked allocation is reliability if a pointer is lost or damaged. Doubly
linked lists provide some protection, at the cost of additional overhead and wasted space.

Fig 7.2 : Linked allocation of disk space
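
The cost of random access under linked allocation can be sketched with a toy model. In this sketch the array next_block stands in for the pointer stored inside each disk block; the names NIL, NUM_BLOCKS, build_chain and linked_lookup, and the block numbers used, are assumptions made for illustration and do not come from any real file system.

```c
#define NIL        (-1)  /* end-of-list marker (hypothetical value) */
#define NUM_BLOCKS 32    /* size of the toy disk (assumption) */

/* next_block[b] stands in for the pointer stored inside disk block b. */
static int next_block[NUM_BLOCKS];

/* Link the listed blocks into one file; return the first block,
 * which is what the directory entry would store. */
int build_chain(const int *blocks, int n)
{
    for (int k = 0; k < n; k++)
        next_block[blocks[k]] = (k + 1 < n) ? blocks[k + 1] : NIL;
    return (n > 0) ? blocks[0] : NIL;
}

/* Find the disk block holding logical block i of the file that starts
 * at `start`. *reads counts the blocks read just to follow pointers. */
int linked_lookup(int start, int i, int *reads)
{
    int b = start;
    *reads = 0;
    while (b != NIL && i > 0) {
        b = next_block[b];  /* the pointer is only known after reading block b */
        (*reads)++;
        i--;
    }
    return b;               /* NIL if the file has fewer than i + 1 blocks */
}
```

For a file chained through blocks 9, 16, 1, 10 and 25, reaching logical block 3 means following three pointers first, which is why the number of reads grows with the position requested.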

 The File Allocation Table, FAT, used by DOS is a variation of linked
allocation, where all the links are stored in a separate table at the beginning
of the disk. A section of disk at the beginning of each volume is set aside to
contain the table. The table has one entry for each disk block and is indexed by
block number. The FAT is used in much the same way as a linked list. The
directory entry contains the block number of the first block of the file. The table
entry indexed by that block number contains the block number of the next block
in the file. This chain continues until the last block, whose table entry holds a
special end-of-file value. The benefit of this approach is that the FAT can be cached
in memory, greatly improving random access speeds. An illustrative example
is the FAT structure shown in Figure 7.3 for a file consisting of disk blocks
217, 618, and 339.

Figure 7.3 : File-allocation table
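
The FAT lookup described above can be sketched as follows, using the example file made of blocks 217, 618 and 339. This is a simplified model and not the on-disk DOS format: the table size, the FAT_EOF marker value and the function names fat_link and fat_chain are assumptions for illustration.

```c
#define FAT_EOF  (-1)   /* end-of-file marker (hypothetical value) */
#define FAT_SIZE 1024   /* number of disk blocks in the toy volume (assumption) */

/* fat[b] holds the number of the block that follows block b in its file. */
static int fat[FAT_SIZE];

/* Record the listed blocks as one file; return the first block number,
 * which is what the directory entry would contain. */
int fat_link(const int *blocks, int n)
{
    for (int k = 0; k < n; k++)
        fat[blocks[k]] = (k + 1 < n) ? blocks[k + 1] : FAT_EOF;
    return (n > 0) ? blocks[0] : FAT_EOF;
}

/* Walk the chain starting at `first`, storing up to max block numbers
 * in out. Only the table is consulted; no data block is read. */
int fat_chain(int first, int *out, int max)
{
    int n = 0;
    for (int b = first; b != FAT_EOF && n < max; b = fat[b])
        out[n++] = b;
    return n;
}
```

Because the walk touches only the table, a FAT cached in memory lets the system locate any block of the file without reading the blocks before it.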

iii. Indexed allocation

Linked allocation solves the external-fragmentation and size-declaration
problems of contiguous allocation. However, in the absence of a FAT, linked
allocation cannot support efficient direct access, since the pointers to the blocks are
scattered with the blocks themselves all over the disk and must be retrieved
in order.

Indexed allocation solves this problem by bringing all the pointers together into
one location: the index block. Each file has its own index block, which is an array of
disk-block addresses. The ith entry in the index block points to the ith block of the file. The
directory contains the address of the index block (Figure 7.4). To find and read the ith
block, we use the pointer in the ith index-block entry.

Fig 7.4 : Indexed allocation of disk space.

When the file is created, all pointers in the index block are set to nil. When the ith
block is first written, a block is obtained from the free-space manager, and its address
is put in the ith index-block entry. Indexed allocation supports direct access, without
suffering from external fragmentation, because any free block on the disk can satisfy
a request for more space. Indexed allocation does suffer from wasted space, however.
Some disk space is wasted ( relative to linked lists or FAT tables ) because an entire index block
must be allocated for each file, regardless of how many data blocks the file contains. This leads to
questions of how big the index block should be, and how it should be implemented. There are
several approaches:

 Linked Scheme - An index block is one disk block, which can be read and written in a single disk
operation. The first index block contains some header information, the first N block addresses, and
if necessary a pointer to additional linked index blocks.
 Multi-Level Index - The first index block contains a set of pointers to secondary index blocks, which
in turn contain pointers to the actual data blocks.


 Combined Scheme - This is the scheme used in UNIX inodes, in which the first 12 or so data block
pointers are stored directly in the inode, and then singly, doubly, and triply indirect pointers provide
access to more data blocks as needed. ( See Fig 7.5. ) The advantage of this scheme is that for small
files ( which many are ), the data blocks are readily accessible ( up to 48K with 4K block sizes );
files up to about 4144K ( using 4K blocks and 4-byte block pointers ) are accessible with only a single
indirect block ( which can be cached ), and huge files are still accessible using a relatively small
number of disk accesses ( larger in theory than can be addressed by a 32-bit address, which is why
some systems have moved to 64-bit file pointers. )

Fig 7.5 : The UNIX inode
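
The 48K and 4144K figures quoted for the combined scheme can be checked with a short calculation. The sketch below assumes 12 direct pointers, 4K blocks and 4-byte block pointers; the function name inode_reach is hypothetical.

```c
#include <stdint.h>

/* Bytes of file data reachable through the direct pointers plus
 * `levels` levels of indirect blocks
 * (0 = direct only, 1 = plus single indirect, 2 = plus double, ...). */
uint64_t inode_reach(uint64_t block_size, uint64_t ptr_size,
                     unsigned direct_ptrs, unsigned levels)
{
    uint64_t ptrs_per_block = block_size / ptr_size; /* 1024 for 4K / 4 bytes */
    uint64_t blocks = direct_ptrs;
    uint64_t fanout = 1;

    for (unsigned l = 0; l < levels; l++) {
        fanout *= ptrs_per_block;  /* data blocks added by this indirect level */
        blocks += fanout;
    }
    return blocks * block_size;
}
```

With these assumptions, the direct pointers alone cover 12 x 4K = 48K, and adding the single indirect block covers (12 + 1024) x 4K = 4144K. Adding the double and triple indirect levels pushes the reachable size past what a 32-bit byte offset can address.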

8. FREE SPACE MANAGEMENT

Storage space on a hard disk is limited, so the space of deleted files must be reused
when allocating new files. The system maintains a free space list that keeps track of the
free disk blocks. These free blocks can be allocated to a new file or directory. When we
want to create a file, free space is allocated to the new file if it is available;
otherwise the file is not created. The process of finding and managing the free blocks of the disk is
called free space management. The methods to implement a free space list are:

 Bitmap
 Linked list
 Grouping
 Counting

i. Bitmap or Bit Vector

A Bitmap or Bit Vector is a series of binary bits (0 and 1) where each disk block is
represented by 1 bit.
 The bit 0 indicates the block is allocated
 The bit 1 indicates the block is free.

 A white block indicates a block allocated to a file
 A grey block indicates a free block

Fig 8.1 : Bit vector method disk blocks

The disk blocks shown in Fig 8.1 (where white blocks are allocated and grey blocks are free)
can be represented by a bitmap of 32 bits as:

Disk Block No:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Bit Vector:     0  0  1  1  1  1  0  0  1  1  1  1  1  1  0  0  0  1  1  0  0  0  0  0  0  1  1  1  0  0  0  0


The bit vector: 00111100111111000110000001110000

The main advantage of the bitmap is that it is simple to understand and efficient at finding the
first free block, or a run of consecutive free blocks, on the disk.
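
A minimal sketch of the search for the first free block follows, using the convention above (1 = free, 0 = allocated). The packing of block i into word i / 32, bit i % 32, and the function names bitmap_load, mark_allocated and first_free are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_WORD 32

/* Load a text bit vector ('1' = free, '0' = allocated) into a zeroed map.
 * Block i is kept in word i / 32, bit i % 32 (an assumed layout).
 * Returns the number of blocks described. */
size_t bitmap_load(const char *bits, uint32_t *map)
{
    size_t i = 0;
    for (; bits[i] != '\0'; i++)
        if (bits[i] == '1')
            map[i / BITS_PER_WORD] |= (uint32_t)1 << (i % BITS_PER_WORD);
    return i;
}

/* Mark block i as allocated by clearing its bit. */
void mark_allocated(uint32_t *map, size_t i)
{
    map[i / BITS_PER_WORD] &= ~((uint32_t)1 << (i % BITS_PER_WORD));
}

/* Return the number of the first free block, or -1 if none is free. */
long first_free(const uint32_t *map, size_t nblocks)
{
    for (size_t w = 0; w * BITS_PER_WORD < nblocks; w++) {
        if (map[w] == 0)
            continue;  /* all 32 blocks in this word are allocated */
        for (size_t i = w * BITS_PER_WORD;
             i < nblocks && i < (w + 1) * BITS_PER_WORD; i++)
            if ((map[w] >> (i % BITS_PER_WORD)) & 1u)
                return (long)i;
    }
    return -1;
}
```

Comparing a whole word with 0 skips 32 allocated blocks in one step, which is where much of the efficiency of the bit vector comes from. For the 32-bit vector of this section the search stops at block 2.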

ii. Linked List:


In this approach, the free disk blocks are linked together in a linked list.

 Free blocks are linked with each other

 A free block contains a pointer to the next free block

 The block number of the very first free disk block is stored at a separate location on disk
and is called the free list head.

 The last free block would contain a null pointer indicating the end of free list.

 wastage of space.

Fig 8.2 : Linked List method of free disk blocks

In Fig 8.2, the free list head points to Block 2, which points to Block 3, the next
free block, and so on. A drawback of this method is that traversing the free space list
requires a disk I/O for every block visited.

iii. Grouping

This approach forms groups based on the contiguous free blocks.


 The first free block stores the address of the first group of contiguous free blocks.

 The last free block in the first group stores the address of the second group of contiguous free
blocks, and so on.

Fig 8.3 : Grouping method of free disk blocks

An advantage of this approach is that the addresses of a group of free disk blocks can be
found easily.

iv. Counting

This approach stores the address of the first free disk block and the number of free
contiguous disk blocks that follow it. Each entry in the free space list therefore holds the address
of a first free block and a count, one entry per group of contiguous free disk blocks. This method
of free space management is similar to the extent method of allocating blocks. These entries can
be stored in a B-tree instead of a linked list.


Fig 8.4 : Counting method of free disk blocks
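
To illustrate the counting method, the sketch below turns a bit vector (written as text, 1 = free) into a list of (first free block, count) entries. The structure free_run and the function free_runs are hypothetical names; a real system would keep such entries on disk rather than derive them from a string.

```c
#include <stddef.h>

/* One free-list entry: first free block and length of the run. */
struct free_run {
    int start;
    int count;
};

/* Scan a text bit vector ('1' = free) and write one entry per run of
 * contiguous free blocks. Returns the number of entries written,
 * which is at most max. */
size_t free_runs(const char *bits, struct free_run *out, size_t max)
{
    size_t n = 0;
    int i = 0;

    while (bits[i] != '\0' && n < max) {
        if (bits[i] != '1') {
            i++;               /* allocated block: keep scanning */
            continue;
        }
        int start = i;
        while (bits[i] == '1') /* extend the run of free blocks */
            i++;
        out[n].start = start;
        out[n].count = i - start;
        n++;
    }
    return n;
}
```

Applied to the 32-bit vector from the bitmap section, this yields four entries: (2, 4), (8, 6), (17, 2) and (25, 3). Four short entries replace 32 bits, and the saving grows as free runs get longer.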

9. USAGE OF OPEN, CREATE, READ, WRITE, CLOSE, LSEEK, STAT, IOCTL SYSTEM CALLS

i. open( ) System call:

The system call open( ) is used to open or create a file. The syntax is:

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int open(const char *path, int flags);
int open(const char *path, int flags, mode_t mod);

The number of arguments in this function can be two or three. The third argument is used
only when creating a new file. When we want to open an existing file, only two arguments are
used. The function returns the smallest available file descriptor, or -1 in case of an error. ( The fd
values 0, 1 and 2 are normally already in use for standard input, standard output and standard
error, so the files a program opens usually receive fd values from 3 onwards. ) Once the
file is opened, the file pointer is placed on the first byte in the file. The argument “path”
represents the file name. The second argument, flags, specifies the type of operation to be
performed on the file. Exactly one of O_RDONLY, O_WRONLY and O_RDWR must be given; the
other flags can be combined with it using bitwise OR ( | ):
O_RDONLY: Opens the file for reading only.
O_WRONLY: Opens the file for writing only.
O_RDWR: Opens the file for both reading and writing.
O_APPEND: Every write is appended to the end of the file.
O_CREAT: The file is created if it does not already exist.
O_TRUNC: If the file exists, all of its content is deleted.
The third argument, mod, is optional and used only when creating a new file. It
defines the file permissions: read, write or execute access for the owner, the group and
other users.

Owner: read, write, execute → S_IRUSR, S_IWUSR, S_IXUSR

Group: read, write, execute → S_IRGRP, S_IWGRP, S_IXGRP

Others: read, write, execute → S_IROTH, S_IWOTH, S_IXOTH

The above define the access rights for a file and they are defined in the sys/stat.h header.
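
As an illustration of the two-argument and three-argument forms, here is a minimal sketch. The wrapper names open_for_writing and open_for_reading and the choice of permissions (read and write for the owner, read for group and others) are assumptions made for this example.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/* Create `path` if needed, empty it if it exists, and open it for writing.
 * The third argument matters only when the file is actually created.
 * Returns a file descriptor, or -1 on error. */
int open_for_writing(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_TRUNC,
                S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
}

/* Open an existing file for reading: two arguments are enough.
 * Returns a file descriptor, or -1 if the file cannot be opened. */
int open_for_reading(const char *path)
{
    return open(path, O_RDONLY);
}
```

Both helpers return the value of open( ) unchanged, so the caller should still compare the result with -1 and eventually pass the descriptor to close( ).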

ii. creat( ) system call:

The system call creat( ) is used to create a new file. The syntax is:

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int creat(const char *path, mode_t mod);

The argument “path” specifies the name of the file, while “mod” defines the access
rights. The access rights are those listed above for the open( ) system call. If the file to be created
does not exist, a new i-node is allocated and a link is added to the directory. If the file exists, it
loses its contents and it is opened for writing. In this case, the second argument is ignored
and the old ownership and access permissions are not modified. The function returns the
smallest file descriptor available, or -1 in case of an error. The system call creat( ) is equivalent to:
open(path, O_WRONLY | O_CREAT | O_TRUNC, mod);

iii. read( ) system call:


When we want to read a certain number of bytes starting from the current position in a file,
we use the read system call. The syntax is:

#include <unistd.h>
ssize_t read(int fd, void* buf, size_t noct);

It reads up to “noct” bytes from the opened file referred to by the file descriptor “fd” and puts the
bytes read into the buffer “buf”. The file pointer (current position) is advanced automatically by
the number of bytes actually read. The function returns the number of bytes read, which may be
less than “noct”, 0 for end of file (EOF), and -1 in case of an error.
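
Because read( ) may return fewer bytes than requested, callers often loop until the requested amount has arrived or EOF is reached. The following is a minimal sketch of such a loop; the name read_full is hypothetical and the retry on EINTR is one common convention, not the only possible policy.

```c
#include <errno.h>
#include <unistd.h>

/* Read exactly noct bytes unless EOF comes first.
 * Returns the number of bytes placed in buf (less than noct only at EOF),
 * or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t noct)
{
    size_t done = 0;

    while (done < noct) {
        ssize_t n = read(fd, (char *)buf + done, noct - done);
        if (n == 0)
            break;             /* end of file */
        if (n == -1) {
            if (errno == EINTR)
                continue;      /* interrupted by a signal: try again */
            return -1;
        }
        done += (size_t)n;     /* the file pointer has advanced by n bytes */
    }
    return (ssize_t)done;
}
```

Each call to read( ) continues from where the previous one stopped, so the loop only has to move its own position within the buffer.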

iv. write( ) system call:

When we want to write a certain number of bytes into a file starting from the current position
we use the write system call. Its syntax is:

#include <unistd.h>
ssize_t write(int fd, const void* buf, size_t noct);

It writes “noct” bytes from the buffer “buf” into the opened file referred to by the file descriptor
“fd”. The function returns the number of bytes written or -1 in case of an error.
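
The value returned by write( ) can be smaller than the number of bytes requested, for example when writing to a pipe or a full disk. A minimal sketch of a loop that keeps writing until everything has been accepted is given below; the name write_all is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/* Write all noct bytes of buf to fd, repeating short writes.
 * Returns noct on success, or -1 on error. */
ssize_t write_all(int fd, const void *buf, size_t noct)
{
    size_t done = 0;

    while (done < noct) {
        ssize_t n = write(fd, (const char *)buf + done, noct - done);
        if (n == -1) {
            if (errno == EINTR)
                continue;      /* interrupted by a signal: try again */
            return -1;
        }
        done += (size_t)n;     /* only n bytes were accepted this time */
    }
    return (ssize_t)done;
}
```

Used together with close( ), a typical sequence is open, write_all, then close, checking each return value against -1.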

v. close( ) system call:

This system call is used to close a file and release the assigned file descriptor “fd”.

#include <unistd.h>
int close(int fd);

The function returns 0 if the file is closed successfully and -1 in case of an error. When the
process terminates, all the files opened by it are closed automatically.


vi. lseek( ) System call:

The position of the file pointer in a file can be changed by calling the lseek system call.
The syntax for lseek is:

#include <sys/types.h>
#include <unistd.h>
off_t lseek(int fd, off_t offset, int ref);


The first argument “fd” is the file descriptor of an opened file. The second argument “offset”
is the number of bytes by which the file pointer is moved; it may be negative. The third argument
“ref” gives the position from which the displacement of the file pointer is measured.

 If “ref” is set to SEEK_SET the positioning is done from the beginning of the file.

 If “ref” is set to SEEK_CUR the positioning is done from the current position.

 If “ref” is set to SEEK_END then the positioning is done from the end of the file.

The function returns the new current position, measured in bytes from the beginning of the file,
or -1 in case of an error.
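
Two common uses of lseek( ) are sketched below: finding the size of an open file by seeking to its end, and reading from a chosen byte offset, which is the direct access method at the system-call level. The function names size_by_seeking and read_at are hypothetical.

```c
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/* Return the size of the open file fd in bytes, leaving the file
 * pointer where it was. Returns -1 on error. */
off_t size_by_seeking(int fd)
{
    off_t here = lseek(fd, 0, SEEK_CUR);   /* remember current position */
    if (here == (off_t)-1)
        return -1;
    off_t end = lseek(fd, 0, SEEK_END);    /* position at end = file size */
    if (end == (off_t)-1)
        return -1;
    if (lseek(fd, here, SEEK_SET) == (off_t)-1)
        return -1;
    return end;
}

/* Read up to n bytes starting `offset` bytes from the beginning of path.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_at(const char *path, off_t offset, void *buf, size_t n)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    ssize_t got = -1;
    if (lseek(fd, offset, SEEK_SET) != (off_t)-1)
        got = read(fd, buf, n);
    close(fd);
    return got;
}
```

Calling lseek(fd, 0, SEEK_CUR) moves nothing and simply reports the current position, which is how the first function saves and restores it.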

vii. stat( ) System call:

The system call stat is used to read the attributes of a file. The syntax is:

#include <sys/types.h>
#include <sys/stat.h>
int stat(const char* path, struct stat* buf);

The first argument “path” gives the file name. The second argument “buf” is used to
store the file attributes read from the i-node of the file. The file attributes include the file access
types, owner, file size, last access time, last modified time, etc. On success, the function returns
zero, and on error, -1 is returned. The structure struct stat is described in the sys/stat.h header and
has the following fields:

struct stat {
    mode_t  st_mode;     /* file access types and rights */
    ino_t   st_ino;      /* i-node */
    dev_t   st_dev;      /* identifier of device containing file */
    nlink_t st_nlink;    /* nr of links */
    uid_t   st_uid;      /* owner ID */
    gid_t   st_gid;      /* group ID */
    off_t   st_size;     /* ordinary file size */
    time_t  st_atime;    /* last time it was accessed */
    time_t  st_mtime;    /* last time it was modified */
    time_t  st_ctime;    /* last time settings were changed */
    long    st_blksize;  /* optimal size of the I/O block */
    long    st_blocks;   /* nr of 512 byte blocks allocated */
};
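
A short sketch of reading a few of these fields is given below. The function names regular_file_size and print_attributes are hypothetical, and the casts in the printf call are there only because the exact integer types behind off_t, nlink_t and uid_t vary between systems.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Return the size in bytes of a regular file, or -1 if path does not
 * exist, cannot be examined, or is not a regular file. */
off_t regular_file_size(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) == -1)
        return -1;
    if (!S_ISREG(sb.st_mode))  /* st_mode also encodes the file type */
        return -1;
    return sb.st_size;
}

/* Print a few attributes of path. Returns 0 on success, -1 on error. */
int print_attributes(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) == -1) {
        perror("stat");
        return -1;
    }
    printf("size: %lld bytes, links: %lu, owner uid: %lu\n",
           (long long)sb.st_size,
           (unsigned long)sb.st_nlink,
           (unsigned long)sb.st_uid);
    return 0;
}
```

Note that stat( ) works on a path name and does not require the file to be opened first.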

viii. ioctl system call:

ioctl stands for Input and Output Control. The system call ioctl( ) is used to interact with
device driver files. Its major use is to handle device-specific operations for
which the kernel does not have a system call by default. It manipulates the underlying device
parameters of device driver files. Some practical uses of ioctl( ) are:

 Ejecting the media from a “cd” drive
 Changing the baud rate of a serial port
 Adjusting the volume
 Reading or writing device registers, etc.

The syntax is:
#include <sys/ioctl.h>

int ioctl(int fd, int request, ... /* argp */);

The first argument “fd” is a file descriptor of an opened file. The ioctl command is
executed on this opened file, which is generally a device file. The second argument
“request” is a device-dependent request code. The request code varies from device to device.
The ioctl command performs the task associated with the request code to achieve the desired
functionality. The third argument is an untyped pointer to memory. It's traditionally char
*argp. An ioctl( ) request has encoded in it whether the argument is an in parameter or out
parameter, and the size of the argument argp in bytes. Macros and defines used in specifying
an ioctl( ) request are located in the file <sys/ioctl.h>. Usually, on success zero or a
positive value is returned. On error, -1 is returned.
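
As a small illustration, the sketch below uses the FIONREAD request, which asks how many bytes are waiting to be read on a descriptor. FIONREAD is available on Linux and BSD-derived systems but is not required by POSIX, so its use here is an assumption about the platform; the function name bytes_waiting is hypothetical.

```c
#include <sys/ioctl.h>
#include <unistd.h>

/* Ask the driver behind fd how many bytes can be read without blocking.
 * Here the third argument is an "out" parameter: a pointer to an int
 * that ioctl( ) fills in. Returns the count, or -1 on error. */
int bytes_waiting(int fd)
{
    int n = 0;

    if (ioctl(fd, FIONREAD, &n) == -1)
        return -1;
    return n;
}
```

On a pipe or terminal this reports the amount of pending input. Other request codes, such as those for serial ports or CD drives, follow the same pattern with a different request value and argument type.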
