Unit5 OS

Uploaded by

Ashwin Sriramoju
Copyright
© © All Rights Reserved
UNIT-V

File System Interface and Operations -Access methods, Directory Structure, Protection, File
System Structure, Allocation methods, Free-space Management. Usage of open, create, read,
write, close, lseek, stat, ioctl system calls.

1. INTRODUCTION TO FILE CONCEPTS


Computers store information permanently on various storage media, such as magnetic disks,
magnetic tapes, and optical disks. The operating system abstracts the physical properties of
these storage devices into a logical storage unit called a file.

• A file is a collection of similar records.
• A record is a collection of related fields that can be treated as a unit by application
programs.
• A field is a basic element of data. Any individual field contains a single value.

FILE ATTRIBUTES:
• Name: A file is named for the convenience of the user and is referred to by its name. A
name is usually a string of characters.
• Identifier: This unique tag, usually a number, identifies the file within the file system.
• Type: Files come in many types. The type is usually indicated by the extension of the file
name. Common file types are listed in the table under FILE TYPES.
• Location: This is a pointer to the location of the file on the storage device.
• Size: The current size of the file (in bytes, words, or blocks).
• Protection: Access control information that determines who can read, write, or execute
the file, and so on.
• Time, date, and user identification: This information may be kept for creation, last
modification, and last use.

FILE OPERATIONS
• Creating a file: The OS first checks whether free space is available. If there is no free
space, the file cannot be created. If space is available, the file is created and an entry for
the new file is added to the directory. The entry includes file attributes such as the file
name, file location, and size.
• Writing a file: The OS searches the directory for the given file. If the file is not found, a
new file is created with the given name. If the file is found, the existing file is opened. The
system keeps a write pointer to the location in the file where the next write is to take place.
After each write operation, the write pointer is updated.
• Reading a file: The system searches the directories for the given file. If the file is found,
the system keeps a read pointer to the location in the file where the next read is to take
place. After each read operation, the read pointer is updated.
• Repositioning within a file: This operation is also called a file seek. The current file
position pointer is set to a given value.
• Deleting a file: To delete a file, the system searches the directory for the given file name,
then releases the file space and erases the directory entry.
• Truncating a file: The entire content of the file is erased, but the file itself and its
attributes (except its length) remain.

Ravindar.M, Asso.Prof, CSE Dept, JITS-KNR

FILE TYPES:
A file name consists of two parts: a name and an extension. The extension indicates the type
of the file and which application software is used to open or run it.
File Type        Extension            Purpose
Executable       .exe, .com, .bin     Ready-to-run machine language program
Source code      .c, .cpp, .asm       Source code in various languages
Object           .obj, .o             Compiled, machine language, not linked
Batch            .bat, .sh            Commands to the command interpreter
Text             .txt, .doc           Textual data, documents
Word processor   .doc, .wp, .rtf      Various word processor formats
Library          .lib, .dll           Library routines for programmers
Print or View    .gif, .pdf, .jpg     ASCII or binary file in a format for printing or viewing
Archive          .arc, .zip           Related files grouped into one file, sometimes compressed for archiving or storage
Multimedia       .mpeg, .mp3, .avi    Binary file containing audio or A/V information

2. FILE ACCESS METHODS
Files store information on secondary storage devices. To be used, this information must be
accessed and read into main memory. The information in a file can be accessed in several
ways.
i. Sequential Access:
Information in the file is accessed in serial order, i.e. one record after the other. Magnetic
tapes support this type of file access.

For example, consider a file consisting of 100 records, with the read/write head currently at
the 45th record. To read the 75th record next, the file is accessed sequentially: 45, 46, 47,
..., 74, 75. Even though only the 75th record is needed, the read/write head traverses all the
records from 45 to 75. So, it is a time-consuming method.

ii. Direct Access:

Direct access is also called relative access. Here records can be read or written in any
order. The direct access method is based on a disk model of a file, because disks allow random
access to any file block. For example, consider a disk containing 512 blocks, with the
read/write head at the 124th block. If the next block to be read or written is the 256th
block, we can jump from the 124th block to the 256th block directly, without passing through
the blocks in between.

As another example, suppose a CD contains 10 songs and we are listening to song 3. If we want
to listen to song 7, we can skip directly to it. (On a tape cassette this is not possible.)
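
The contrast between sequential and direct access can be sketched in C with the lseek and read system calls. This is only an illustrative sketch: it assumes a file made of fixed-size records, and the struct record layout and the function names read_record_direct and read_record_sequential are hypothetical, not part of any library. For simplicity, the sequential version starts from the first record rather than from the current position.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical fixed-size record, used only for illustration. */
struct record {
    int  id;
    char name[28];
};

/* Direct access: jump straight to record n (0-based), then read it.
   Returns 0 on success, -1 on error or if the record does not exist. */
int read_record_direct(int fd, long n, struct record *out)
{
    off_t pos = (off_t)n * (off_t)sizeof *out;
    if (lseek(fd, pos, SEEK_SET) == (off_t)-1)
        return -1;
    return read(fd, out, sizeof *out) == (ssize_t)sizeof *out ? 0 : -1;
}

/* Sequential access: rewind, then read every record up to and including n.
   Returns the number of read calls performed, or -1 on error. */
long read_record_sequential(int fd, long n, struct record *out)
{
    long reads = 0;
    if (lseek(fd, 0, SEEK_SET) == (off_t)-1)
        return -1;
    for (long i = 0; i <= n; i++) {
        if (read(fd, out, sizeof *out) != (ssize_t)sizeof *out)
            return -1;
        reads++;
    }
    return reads;
}
```

Reaching record n directly costs one seek and one read, whereas the sequential version performs n + 1 reads, which is the cost the 45-to-75 example above describes.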

iii. Indexed Sequential File Access

This access method combines sequential access and direct access. The file contains a set of
records, and a group of records forms a block. The main idea is to perform direct access on
the blocks and then sequential access on the records within a block. This access method
requires maintaining an index, similar to the index of a textbook. Each index entry contains
two fields, a key field and a pointer field. To access data in a file, direct access is
performed to locate the block using the pointer field, and within the block the record is
accessed sequentially. Indexes can themselves become large, so a hierarchy of indexes may be
built, in which one direct access to an index leads to the information needed to access
another index directly, and so on, until the actual file is accessed sequentially for the
particular record.
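
The two-step lookup described above can be sketched as follows. This is a minimal in-memory model under stated assumptions: records are sorted by key across blocks, each index entry stores the largest key of its block, and all names (struct block, struct index_entry, isam_find, RECORDS_PER_BLOCK) are hypothetical. The index is scanned linearly here for brevity; a real system would more likely use binary search or a multilevel index.

```c
#include <stddef.h>

#define RECORDS_PER_BLOCK 4

/* Hypothetical in-memory model: a block holds a few keyed records. */
struct record { int key; int value; };
struct block  { struct record recs[RECORDS_PER_BLOCK]; int count; };

/* One index entry per block: the key field holds the largest key in the
   block, the pointer field says where the block is. */
struct index_entry { int max_key; const struct block *blk; };

/* Step 1: use the index to go directly to the right block.
   Step 2: scan that block sequentially for the key.
   Returns a pointer to the record, or NULL if the key is absent. */
const struct record *isam_find(const struct index_entry *index,
                               size_t n_entries, int key)
{
    for (size_t i = 0; i < n_entries; i++) {
        if (key <= index[i].max_key) {
            const struct block *b = index[i].blk;
            for (int j = 0; j < b->count; j++)
                if (b->recs[j].key == key)
                    return &b->recs[j];
            return NULL;  /* key belongs in this block but is not there */
        }
    }
    return NULL;          /* key is larger than every indexed key */
}
```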

3. DIRECTORY STRUCTURE
A computer system may contain thousands to millions of files, which makes them very hard to
manage. The directory concept was introduced to manage these files. The files are grouped,
and each group is placed in a container called a directory. In Windows, directories are also
called folders. A directory structure provides a mechanism for organizing many files in the
file system. A directory can contain multiple files, and it can also contain other
directories. The directory holds information about file attributes such as location,
ownership, and size. The directory structures supported by operating systems are:
i. Single level directory:
This directory system contains only one directory, called the root directory. All files are
stored in this directory. When the number of files increases, or when the system has more
than one user, a single-level directory becomes impractical. Since all the files are in the
same directory, they must have unique names. For example, if user-1 creates a file called
sample and user-2 later creates a file called sample, then user-2's file will overwrite
user-1's file.

ii. Two level directory:

In the two-level directory structure, the root directory is called the master file directory
(MFD). Each user has their own user file directory (UFD). The MFD is indexed by user name,
and each entry points to the UFD for that user. The files of a particular user are stored in
that user's UFD. In the diagram below, the root directory is the MFD; user1, user2, user3 and
user4 are user-level directories; and f1, f2, …, f8 are files. Different users can have files
with the same name.

iii. Tree structured directory:
A two-level directory eliminates name conflicts among users, but it is not satisfactory for
users with a large number (hundreds to millions) of files. To handle this, each user can
create sub-directories and place files of the same type in one sub-directory. A sub-directory
can itself contain another sub-directory, and so on. The result can be viewed as a tree-like
structure, in which each user can have as many directories as needed. The user can change the
current directory at any time. If a needed file is not in the current directory, the user
must either specify a path name or change the current directory. Paths can be of two types:

a) Absolute Path: Begins at the root and follows a path down to the specified file.
Ex: \user2\programs\a1.java

b) Relative Path: Defines a path from the current directory.
Ex: programs\a1.java if user2 is the current directory.

iv. Acyclic graph directory

When multiple users work on the same project, the project files can be stored in a common
sub-directory and shared among those users. This type of directory structure is called an
acyclic graph directory. The common directory is declared as a shared directory. The graph
contains no cycles. With shared files, changes made by one user are visible to the other
users. A file may now have multiple absolute paths. When a shared directory or shared file is
deleted, all pointers to that directory or file must also be removed. In the diagram below,
user1 and user2 share a directory called Programs. Similarly, user3 and user4 share a file
called t3.txt.

v. General graph directory:
When links are added to an existing tree-structured directory, the tree structure is
destroyed, resulting in a general graph structure. Cycles are allowed within the directory
structure, and a directory can have more than one parent directory. The advantage of this
type of directory is flexible sharing. The drawback is that traversal must guard against
cycles, so that a search does not visit the same directory again and again.

4. PROTECTION
The information stored in a computer system should be protected from improper access.
Protection mechanisms provide controlled access by limiting the types of file access that can
be made. Access is permitted or denied depending on several factors, such as the type of user
and the type of access requested.

The most common approach to the protection problem is to make access dependent on the
identity of the user. Different users need different types of access to a file. An access
control list (ACL), specifying user names and the types of access allowed for each user, is
associated with each file. When a user requests access to a file, the OS checks the ACL
associated with that file. If that user is listed for the requested access, the access is
allowed. Otherwise, a protection violation occurs and the user process is denied access to
the file.

Access can be granted to the following classes of users:

• Owner: The user who created the file is the owner.
• Group: A set of users who share the file.
• Universe: All the other users in the system constitute the universe.

The access types for a file can be:

• Read ( r ): Read from the file
• Write ( w ): Write or rewrite the file
• Execute ( x ): Load the file into memory and execute it
• No access ( - ): The access is not allowed

The general format of file access permissions is given below:

Owner   Group   Universe
rwx     rwx     rwx

Example: file_name   rwx   rw-   r--

On the given file_name, the owner can read, write and execute; the group can read and write;
and the universe (other users) can only read.

Other protection approaches: Associate a password with each file.

Disadvantages

• The number of passwords that a user needs to remember may become large if different
passwords are set for different files.
• If only one password is used for all files, then once it is discovered, all files are
accessible.

5. FILE SYSTEM STRUCTURE


A file system must provide an efficient mechanism to store, locate and retrieve files in a
convenient way. Most operating systems use a layered approach for every task, including file
systems. Each layer of the file system is responsible for certain activities. The image shown
below illustrates how the file system is divided into layers, and the functionality of each
layer.

• When an application program asks for a file, the request is first directed to the logical
file system. The logical file system manages the metadata of the file and the directory
structure. It maintains file structure via file control blocks. A file control block (an
inode in UNIX file systems) contains information about the file, such as ownership,
permissions, and the location of the file contents. If the application program does not have
the required permissions for the file, this layer reports an error. The logical file system
also verifies the path to the file.

• Files are stored on and retrieved from the hard disk. A hard disk is divided into tracks,
and each track is divided into sectors; one or more sectors make up a disk block. The file
content is divided into logical blocks, and each logical block is stored in a physical disk
block. Therefore, in order to store and retrieve files, the logical blocks need to be mapped
to physical blocks. This mapping is done by the file organization module, which is also
responsible for free space management.

• Once the file organization module has decided which physical blocks the application program
needs, it passes this information to the basic file system. The basic file system is
responsible for issuing commands to I/O control in order to fetch those blocks.
• I/O control contains the code used to access the hard disk. This code is known as device
drivers. I/O control is also responsible for handling interrupts.

6. FILE SYSTEM IMPLEMENTATION


A file system is implemented with the help of several on-disk data structures. These data
structures vary depending on the operating system.

• Boot Control Block

The boot control block contains the information needed to boot an operating system from the
hard disk. It is called the boot block in the UNIX file system and the partition boot sector
in NTFS (Windows).

When the computer is switched on, a special program called the BIOS, stored in ROM, is
executed by the processor. The BIOS contains the code to access the very first sector of the
hard disk, called the master boot record (MBR). The MBR contains information about how and
where the operating system is located on the hard disk, so that it can be loaded into RAM. If
the disk does not contain an OS, this block can be empty. The MBR also includes a partition
table, which locates every partition on the hard disk.

• Volume Control Block

The volume control block contains information about the volume, such as the number of blocks,
the size of each block, partition details, and pointers to free blocks and free FCBs. In the
UNIX file system, it is known as the superblock. In NTFS, this information is stored in the
master file table.

• Directory Structure (per file system)

A directory structure (per file system) contains file names and pointers to the corresponding
file control blocks (FCBs). In UNIX, it maps file names to their inode numbers.

• File Control Block

The file control block contains the details about a file, such as ownership, permissions, and
file size. In UFS, these details are stored in the inode. In NTFS, this information is stored
in the master file table, which uses a relational database structure. A typical file control
block is shown in the image below.
File Permissions
File Dates (Create, Access, Write)
File Owner, Group, ACL
File Size
File Data Blocks
Figure: File Control Block

Before an application program can use a file, the file must be opened. The open( ) call
passes a file name to the logical file system. The open( ) system call first searches the
system-wide open file table to see if the file is already in use by another process. If it
is, a per-process open file table entry is created pointing to the existing system-wide open
file table entry. If the file is not already open, the directory structure is searched for
the given file name. Once the file is found, its FCB is copied into the system-wide open file
table in memory. This table not only stores the FCB but also tracks the number of processes
that have the file open.

Next, an entry is made in the per-process open file table, with a pointer to the entry in the
system-wide open file table and some other fields. These fields include a pointer to the
current location in the file (for the next read/write operation) and the access mode in which
the file is open. The open( ) call returns a pointer to the appropriate entry in the
per-process open file table (a file descriptor in UNIX). All file operations are performed
via this pointer. When a process closes the file, the per-process table entry is removed and
the open count in the system-wide entry is decremented. When all processes that opened the
file have closed it, any updated metadata is copied back to the disk-based directory
structure and the system-wide open file table entry is removed.

7. FILE ALLOCATION METHODS


An allocation method refers to how disk blocks are allocated to files:

i. Contiguous allocation:
• A single contiguous set of blocks is allocated to a file at the time of file creation.
• Each file occupies a set of contiguous blocks.
• The directory entry needs only the starting location (block #) and the length (number of
blocks).
• It is simple and gives the best performance in most cases.
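
Because only the start and length are stored, finding the disk block that holds a given logical block is a single addition. A minimal sketch, with the hypothetical names struct contig_file and contig_block:

```c
/* Directory information for a contiguously allocated file (hypothetical). */
struct contig_file {
    long start;   /* first disk block of the file  */
    long length;  /* number of blocks in the file  */
};

/* Map logical block i (0-based) of the file to a disk block number.
   Returns -1 if i lies outside the file. */
long contig_block(const struct contig_file *f, long i)
{
    if (i < 0 || i >= f->length)
        return -1;
    return f->start + i;
}
```

For a file that starts at block 14 and is 3 blocks long, logical blocks 0, 1 and 2 map to disk blocks 14, 15 and 16.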

ii. Linked allocation

• Allocation is done one block at a time.
• When a file needs more than one block, a linked list of blocks is maintained.
• Each block contains a pointer to the next block in the chain, and the last block contains a
NULL pointer.
• The directory stores each file name together with its starting and ending blocks, as shown
in the diagram.
• This allocation method uses free blocks effectively, because any free block can be added to
a file.
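
The cost of linked allocation shows up when a block in the middle of a file is needed: the chain must be followed from the start. The sketch below models the per-block pointers as an array, which is an assumption made for illustration (on disk, each pointer lives inside its block); the names linked_block and END_OF_CHAIN are hypothetical.

```c
#define END_OF_CHAIN (-1)   /* stands in for the NULL pointer of the last block */

/* next[b] is the block that follows block b in its file, or END_OF_CHAIN.
   Reaching logical block i (i >= 0) means following i pointers from the
   starting block. Returns the disk block number, or END_OF_CHAIN if the
   file has fewer than i + 1 blocks. */
long linked_block(const long *next, long start, long i)
{
    long b = start;
    while (i > 0 && b != END_OF_CHAIN) {
        b = next[b];
        i--;
    }
    return b;
}
```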

iii. Indexed allocation
• Each file has its own index block(s) of pointers to its data blocks.
• The file allocation table contains a separate one-level index for each file.
• To reach the data in a particular block, there is no need to traverse from the starting
block of the file, as there is in linked allocation.
• All the block numbers allocated to a file can be obtained from the index block.
• The directory structure contains the file name and its associated index block.
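
With an index block, any logical block is found with one array lookup instead of a walk along a chain. A minimal sketch, in which struct index_block and indexed_block are hypothetical names and the index is held in memory for simplicity:

```c
#include <stddef.h>

/* Index block of one file: an array of disk block numbers (hypothetical). */
struct index_block {
    const long *ptrs;   /* ptrs[i] is the disk block of logical block i */
    size_t      count;  /* number of data blocks in the file            */
};

/* Return the disk block that holds logical block i, or -1 if out of range. */
long indexed_block(const struct index_block *ib, size_t i)
{
    return i < ib->count ? ib->ptrs[i] : -1;
}
```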

8. FREE SPACE MANAGEMENT


The storage space on a hard disk is limited, so the space of deleted files must be reused
for new files. The system maintains a free space list to keep track of the free disk blocks.
These free blocks can be allocated to a new file or directory. When a file is to be created,
free space, if available, is allocated to the new file; otherwise the file is not created.
The process of finding and managing the free blocks of the disk is called free space
management. The methods used to implement a free space list are:

• Bitmap
• Linked list
• Grouping
• Counting
i. Bitmap or Bit Vector


A bitmap or bit vector is a series of binary bits (0 and 1) in which each bit corresponds to
one disk block.
• The bit 0 indicates that the block is allocated.
• The bit 1 indicates that the block is free.

Figure 1: Bit vector method disk blocks (white blocks are allocated to files; grey blocks are free)

The disk blocks shown in Figure 1 can be represented by a bitmap of 32 bits as:

Block:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Bit:    0  0  1  1  1  1  0  0  1  1  1  1  1  1  0  0  0  1  1  0  0  0  0  0  0  1  1  1  0  0  0  0

The bit vector: 00111100111111000110000001110000
The main advantage of the bitmap is that it is simple to understand and makes it efficient to
find free blocks on the disk.
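
Finding and allocating a free block in a bit vector can be sketched as follows. The sketch follows the convention above (1 = free, 0 = allocated) and assumes that block 0 is stored in the most significant bit of a 32-bit word, so that the word reads left to right like the bit vector; the function names are hypothetical.

```c
#include <stdint.h>

/* Return 1 if the given block (0..31) is free, 0 if it is allocated. */
int block_is_free(uint32_t map, int block)
{
    return (int)((map >> (31 - block)) & 1u);
}

/* Return the number of the first free block, or -1 if none is free. */
int first_free_block(uint32_t map)
{
    for (int b = 0; b < 32; b++)
        if (block_is_free(map, b))
            return b;
    return -1;
}

/* Mark a block as allocated by clearing its bit; returns the new map. */
uint32_t allocate_block(uint32_t map, int block)
{
    return map & ~(UINT32_C(1) << (31 - block));
}
```

The bit vector 00111100111111000110000001110000 is 0x3CFC6070 in hexadecimal. For that value, first_free_block returns 2, and after allocate_block(map, 2) the first free block is 3.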

ii. Linked List:


In this approach, the free disk blocks are linked together in a linked list.
• Free blocks are linked with each other.
• Each free block contains a pointer to the next free block.
• The block number of the very first free disk block is stored at a separate location on disk
and is called the free list head.
• The last free block contains a null pointer indicating the end of the free list.

Figure 2: Linked List method of free disk blocks

In Figure 2, the free space list head points to Block 2, which points to Block 3, the next
free block, and so on. A drawback of this method is the extra I/O required to traverse the
free space list.

iii. Grouping
This approach forms groups of contiguous free blocks.
• The first free block stores the address of the first group of contiguous free blocks.
• The last free block in the first group stores the address of the second group of contiguous
free blocks, and so on.

Figure 3: Grouping method of free disk blocks

An advantage of this approach is that the addresses of a group of free disk blocks can be
found quickly.

iv. Counting
This approach stores the address of the first free disk block and the number of free
contiguous disk blocks that follow it. Each entry in the free space list therefore contains
the address of the first free block and the count for one group of contiguous free disk
blocks. This method of free space management is similar to the extent method of allocating
blocks. The entries can be stored in a B-tree instead of a linked list.
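
As an illustration, the counting entries can be derived from the 32-bit bit vector given in the bitmap section. The sketch assumes 1 = free with block 0 in the most significant bit, and it stores the run length including the first block; struct free_run and build_free_runs are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry of a counting free list (hypothetical layout). */
struct free_run {
    int start;  /* first free block of the run                   */
    int count;  /* number of contiguous free blocks in the run,  */
                /* including the first one                       */
};

/* Scan a 32-bit map and write one entry per run of free blocks.
   Returns the number of entries written (at most max_runs). */
size_t build_free_runs(uint32_t map, struct free_run *runs, size_t max_runs)
{
    size_t n = 0;
    int b = 0;
    while (b < 32 && n < max_runs) {
        if ((map >> (31 - b)) & 1u) {
            int start = b;
            while (b < 32 && ((map >> (31 - b)) & 1u))
                b++;                       /* extend the run */
            runs[n].start = start;
            runs[n].count = b - start;
            n++;
        } else {
            b++;
        }
    }
    return n;
}
```

For the bit vector 00111100111111000110000001110000 (0x3CFC6070) this yields four entries: (2, 4), (8, 6), (17, 2) and (25, 3), which describe the same 15 free blocks in far less space than 15 separate list nodes.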

Figure 4: Counting method of free disk blocks

9. USAGE OF OPEN, CREATE, READ, WRITE, CLOSE, LSEEK, STAT, IOCTL SYSTEM CALLS
i. open( ) System call:
The system call open( ) is used to open or create a file. The syntax is:

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int open(const char *path, int flags, ... /* mode_t mod */);

The function takes two or three arguments. The third argument is used only when creating a
new file; to open an existing file, only two arguments are needed. On success, the function
returns the smallest available file descriptor (fd). Descriptors 0, 1 and 2 are normally
already in use for standard input, standard output and standard error, so the first file a
program opens usually gets fd 3. On error, the function returns -1. Once the file is opened,
the file pointer is placed on the first byte of the file. The argument “path” is the file
name. The second argument, “flags”, specifies the type of operation to be performed on the
file. It can be:
O_RDONLY: Opens the file for reading.
O_WRONLY: Opens the file for writing.
O_RDWR: Opens the file for reading and writing.
O_APPEND: Every write goes to the end of the file.
O_CREAT: Creates the file if it does not already exist.
O_TRUNC: If the file exists, all of its content is deleted.
One of the first three flags must be given; the others can be combined with it using the
bitwise OR operator (|).
The third argument, “mod”, is optional and is used only when creating a new file. It defines
the file permissions: read, write or execute access for the owner, the group and other users.

Owner: read, write, execute → S_IRUSR, S_IWUSR, S_IXUSR
Group: read, write, execute → S_IRGRP, S_IWGRP, S_IXGRP
Others: read, write, execute → S_IROTH, S_IWOTH, S_IXOTH
These constants define the access rights for a file and are defined in the sys/stat.h header.
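
The two-argument and three-argument forms of open( ) can be sketched as two small helpers. The helper names open_for_rewrite and open_for_read are hypothetical; the flags and permission constants are the ones listed above. Note that the permissions actually applied to a new file are also filtered by the process umask.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

/* Three-argument form: open path for writing, creating it with
   permissions rw-r--r-- if it does not exist and emptying it if it does.
   Returns a file descriptor, or -1 on error. */
int open_for_rewrite(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_TRUNC,
                S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
}

/* Two-argument form: open an existing file for reading only.
   Returns a file descriptor, or -1 if the file cannot be opened. */
int open_for_read(const char *path)
{
    return open(path, O_RDONLY);
}
```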

ii. creat( ) system call:

The system call creat( ) is used to create a new file. The syntax is:
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int creat(const char *path, mode_t mod);

The argument “path” specifies the name of the file, while “mod” defines the access rights,
as described above for the open( ) system call. If the file to be created does not exist, a
new i-node is allocated and a link to it is added to the directory. If the file exists, it
loses its contents and is opened for writing. In this case, the second argument is ignored
and the old ownership and access permissions are not modified. On success, the function
returns the smallest available file descriptor; on error, it returns -1. The system call
creat( ) is equivalent to:
open(path, O_WRONLY | O_CREAT | O_TRUNC, mod);

iii. read( ) system call:

To read a certain number of bytes starting from the current position in a file, we use the
read system call. The syntax is:

#include <unistd.h>
ssize_t read(int fd, void* buf, size_t noct);

It reads up to “noct” bytes from the open file referred to by the file descriptor “fd” and
stores them in the buffer “buf”. The file pointer (current position) is advanced
automatically by the number of bytes read. The function returns the number of bytes read
(which may be less than “noct”), 0 at end of file (EOF), and -1 if an error occurred.

iv. write( ) system call:

To write a certain number of bytes into a file starting from the current position, we use the
write system call. Its syntax is:
#include <unistd.h>
ssize_t write(int fd, const void* buf, size_t noct);

It writes up to “noct” bytes from the buffer “buf” into the open file referred to by the file
descriptor “fd”. The function returns the number of bytes written, or -1 in case of an error.
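
read( ) and write( ) are typically used together in a loop. The sketch below copies everything from one open descriptor to another; the function name copy_fd and the 4096-byte buffer size are arbitrary choices for illustration. The inner loop is there because write( ) may write fewer bytes than requested.

```c
#include <sys/types.h>
#include <unistd.h>

/* Copy all bytes from in_fd to out_fd.
   Returns the total number of bytes copied, or -1 on error. */
long copy_fd(int in_fd, int out_fd)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    while ((n = read(in_fd, buf, sizeof buf)) > 0) {
        ssize_t done = 0;
        while (done < n) {   /* keep writing until the whole chunk is out */
            ssize_t w = write(out_fd, buf + done, (size_t)(n - done));
            if (w < 0)
                return -1;
            done += w;
        }
        total += n;
    }
    return n < 0 ? -1 : total;   /* n == 0 means end of file was reached */
}
```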

v. close( ) system call:

This system call is used to close a file and release the assigned file descriptor “fd”.
#include <unistd.h>
int close(int fd);

The function returns 0 if the file is closed successfully and -1 in case of an error. When a
process terminates, all the files opened by it are closed automatically.

vi. lseek( ) System call:

The position of the file pointer in a file is changed by calling the lseek system call. The
syntax for lseek is:

#include <sys/types.h>
#include <unistd.h>
off_t lseek(int fd, off_t offset, int ref);

The first argument, “fd”, is the file descriptor of an open file. The second argument,
“offset”, is the number of bytes by which the file pointer is to be moved. The third
argument, “ref”, gives the position from which the offset is measured.
• If “ref” is set to SEEK_SET, the positioning is done from the beginning of the file.
• If “ref” is set to SEEK_CUR, the positioning is done from the current position.
• If “ref” is set to SEEK_END, the positioning is done from the end of the file.
The function returns the new position, measured in bytes from the beginning of the file, or
-1 in case of an error.
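
Because lseek( ) returns the new position, seeking by 0 bytes is a common way to query positions. The sketch below uses all three reference values to find the size of an open file without disturbing the current position; the function name file_size is hypothetical.

```c
#include <sys/types.h>
#include <unistd.h>

/* Return the size in bytes of the file open on fd, leaving the file
   pointer where it was. Returns -1 on error. */
off_t file_size(int fd)
{
    off_t here = lseek(fd, 0, SEEK_CUR);          /* remember current position */
    if (here == (off_t)-1)
        return -1;

    off_t end = lseek(fd, 0, SEEK_END);           /* offset 0 from the end = size */
    if (end == (off_t)-1)
        return -1;

    if (lseek(fd, here, SEEK_SET) == (off_t)-1)   /* restore the position */
        return -1;
    return end;
}
```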

vii. stat( ) System call:

The system call stat is used to read the attributes of a file. The syntax is:
#include <sys/types.h>
#include <sys/stat.h>
int stat(const char* path, struct stat* buf);

The first argument, “path”, gives the file name. The second argument, “buf”, receives the
file attributes read from the i-node of the file. The file attributes include the file type
and access rights, owner, file size, last access time, last modification time, etc. On
success, the function returns zero; on error, −1 is returned. The structure struct stat is
declared in the sys/stat.h header and has the following fields:

struct stat {
    mode_t    st_mode;    /* file type and access rights */
    ino_t     st_ino;     /* i-node number */
    dev_t     st_dev;     /* identifier of device containing file */
    nlink_t   st_nlink;   /* nr of links */
    uid_t     st_uid;     /* owner ID */
    gid_t     st_gid;     /* group ID */
    off_t     st_size;    /* ordinary file size in bytes */
    time_t    st_atime;   /* last time it was accessed */
    time_t    st_mtime;   /* last time it was modified */
    time_t    st_ctime;   /* last time the status (i-node) was changed */
    blksize_t st_blksize; /* optimal size of the I/O block */
    blkcnt_t  st_blocks;  /* nr of 512 byte blocks allocated */
};
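
A short sketch of stat( ) in use: the helper below reports the size of a regular file by reading st_mode and st_size. The function name regular_file_size is hypothetical; S_ISREG is the POSIX macro from sys/stat.h that tests whether st_mode describes a regular file.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Return the size in bytes of the regular file at path.
   Returns -1 if stat fails or if path is not a regular file. */
long long regular_file_size(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) == -1)
        return -1;               /* e.g. the file does not exist */
    if (!S_ISREG(sb.st_mode))
        return -1;               /* directory, device file, etc. */
    return (long long)sb.st_size;
}
```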

viii. ioctl( ) system call:

ioctl stands for input/output control. The system call ioctl( ) is used to interact with
device driver files. Its main use is to handle device-specific operations for which the
kernel does not provide a dedicated system call. It manipulates the underlying device
parameters of device files. Some real-world uses of ioctl( ) are:

• Ejecting the media from a CD drive
• Changing the baud rate of a serial port
• Adjusting the volume
• Reading or writing device registers, etc.
The syntax is:
#include <sys/ioctl.h>

int ioctl(int fd, int request, ... /* arg */);

The first argument, “fd”, is the file descriptor of an open file, generally a device file,
on which the request is to be performed. The second argument, “request”, is a
device-dependent request code, which varies from device to device. The driver carries out the
task associated with the request code. The third argument is an untyped pointer to memory,
traditionally written char *argp. An ioctl( ) request has encoded in it whether the argument
is an in parameter or an out parameter, and the size of the argument argp in bytes. Macros
and definitions used in specifying an ioctl( ) request are located in the file
<sys/ioctl.h>. Usually, zero or a positive value is returned on success. On error, -1 is
returned.
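
As a small, hedged illustration, the sketch below uses the FIONREAD request to ask how many bytes are waiting to be read on a descriptor. FIONREAD is an assumption about the platform: it is widely available on Linux and BSD-derived systems through sys/ioctl.h, but it is not one of the examples in the list above and other systems may not define it. The function name bytes_available is hypothetical.

```c
#include <sys/ioctl.h>

/* Ask the driver behind fd how many bytes can be read without blocking.
   Assumes the platform defines the FIONREAD request code.
   Returns the byte count, or -1 on error. */
int bytes_available(int fd)
{
    int n = 0;                          /* out parameter filled in by ioctl */

    if (ioctl(fd, FIONREAD, &n) == -1)
        return -1;
    return n;
}
```

Here the third argument is an out parameter: the driver writes the answer into n, which is why its address is passed.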
