File Management
CHAPTER
FILE MANAGEMENT
Contents
> Introduction
> File Concept
    - Definition of a File
    - File Attributes
    - File Operations
    - File Type
    - File Structure
    - Internal File Structure
> File Access Methods
    - Sequential Access Method
    - Direct Access Method
    - Indexed Access Method
    - Indexed Sequential Access Method
> Directory Structure
    - Single-Level Directory
    - Two-Level Directory
    - Tree-Structured Directory
    - Acyclic Graph Directory
    - General Graph Directory Structure
> File Allocation Methods
    - Contiguous Allocation
    - Linked List Allocation
    - Indexed Allocation
> Free Space Management
    - Bitmap or Bit Vector
    - Linked List
    - Grouping
    - Counting
> Directory Implementation
    - Simple Directories
    - Hierarchical Directory Systems
    - Directory Operations
> Efficiency and Performance
    - Efficiency
    - Performance
> Protection
    - Types of Access
    - Access Control
> Recovery
    - Consistency Checking
    - Backup and Restore
> Review Questions
8.1 Introduction

The file system is the most visible aspect of an operating system. It provides the mechanism for on-line storage of and access to both data and programs of the operating system and all the users of the computer system. The file system consists of two distinct parts:
- A collection of files, each storing related data.
- A directory structure, which organizes and provides information about all the files in the system.

In this chapter, we consider the various aspects of files and the major directory structures. We also discuss the semantics of sharing files among multiple processes, users, and computers. Finally, we discuss ways to handle file protection, necessary when we have multiple users and we want to control who may access files and how files may be accessed.

File management is one of the basic but important features provided by the operating system. File management in an operating system is the software that handles or manages the files (binary, text, pdf, docs, audio, video, etc.) present in the computer system.

The file system in the operating system is capable of managing individual files as well as groups of files present in the computer system. File management in the operating system tells us about the time of creation and modification, type, and state of a file.

The main objectives of the file management system:
- It provides I/O support for a variety of storage device types.
- It minimizes the chances of lost or destroyed data.
- It helps the OS provide standardized I/O interface routines for user processes.
- It provides I/O support for multiple users in a multiuser systems environment.

8.2 File Concept

Computers can store information on various storage media, such as hard disks, magnetic tapes, and optical disks. So that the computer system will be convenient to use, the operating system provides a uniform logical view of stored information. The operating system abstracts from the physical properties of its storage devices to define a logical storage unit, the file. Files are mapped by the operating system onto physical devices. These storage devices are usually non-volatile, so the contents are persistent between system reboots.

8.2.1 Definition of a File

A file is a named collection of related information that is recorded on secondary storage. A file can be defined as a data structure which stores the sequence of records. Files are stored in a file system, which may exist on a disk or in the main memory. From a user's perspective, a file is the smallest allotment of logical secondary storage; that is, data cannot be written to secondary storage unless they are within a file.

The information in a file is defined by its creator. Many different types of information may be stored in a file, such as:
- Source or executable programs
- Numeric or text data
- Photos
- Music

8.2.2 File Attributes

A file's attributes vary from one operating system to another but typically consist of these:
- Name: The symbolic file name is the only information kept in human-readable form.
- Identifier: This unique tag, usually a number, identifies the file within the file system; it is the non-human-readable name for the file.
- Type: This information is needed for systems that support different types of files.
- Location: This information is a pointer to a device and to the location of the file on that device.
- Size: The current size of the file (in bytes, words, or blocks) and possibly the maximum allowed size are included in this attribute.
- Protection: Access-control information determines who can do reading, writing, executing, and so on.
- Time, date, and user identification: This information may be kept for creation, last modification, and last use. These data can be useful for protection, security, and usage monitoring.

8.2.3 File Operations

A file is a collection of logically related data that is recorded on secondary storage as a sequence of records. The various operations which can be implemented on a file, such as read, write, open, and close, are called file operations. These operations are performed by the user by using the commands provided by the operating system. Some common operations are as follows:

- Create Operation: This operation is used to create a file in the file system. It is the most widely used operation performed on the file system. To create a new file of a particular type, the associated application program calls the file system. The file system allocates space to the file. As the file system knows the format of the directory structure, an entry for the new file is made in the appropriate directory.
- Write Operation: This operation is used to write information into a file. A write system call is issued that specifies the name of the file and the length of the data to be written. The file length is increased by the specified value, and the file pointer is repositioned after the last byte written.
- Read Operation: To read from a file, we use a system call that specifies the name of the file and where (in memory) the next block of the file should be put. The directory is searched for the associated entry, and the system needs to keep a read pointer to the location in the file where the next read is to take place. Once the read has taken place, the read pointer is updated. Because a process is usually either reading from or writing to a file, the current operation location can be kept as a per-process current file-position pointer. Both the read and write operations use this same pointer, saving space and reducing system complexity.
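The shared file-position pointer used by the read and write operations can be observed from a user program. The following is a minimal sketch in Python, assuming a temporary file and the byte strings shown; the function name demo_file_pointer is hypothetical and chosen only for illustration.

```python
import os
import tempfile


def demo_file_pointer():
    """Show that reads and writes on one open file share one position pointer."""
    positions = []
    with tempfile.TemporaryFile(mode="w+b") as f:   # create and open the file
        positions.append(f.tell())                   # pointer starts at 0
        f.write(b"hello world")                      # write moves the pointer
        positions.append(f.tell())                   # just past the last byte written
        f.seek(0)                                    # reposition to the start
        data = f.read(5)                             # read moves the same pointer
        positions.append(f.tell())
        f.seek(0, os.SEEK_END)                       # append: write at the end
        f.write(b"!")
        positions.append(f.tell())
        f.truncate(0)                                # truncate: length reset to zero
        f.seek(0)
        remaining = f.read()                         # nothing is left to read
    return positions, data, remaining


positions, data, remaining = demo_file_pointer()
```

Here positions records the pointer after each step, data holds the five bytes that were read, and remaining is empty because the file was truncated while it stayed open.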
- Truncate Operation: The user may want to erase the contents of a file but keep its attributes. Rather than forcing the user to delete the file and then recreate it, this function allows all attributes to remain unchanged (except for file length) but lets the file be reset to length zero and its file space released.

These basic operations comprise the minimal set of required file operations.

- Close Operation: When the processing of the file is complete, it should be closed so that all the changes are made permanent and all the occupied resources are released. On closing, the system deallocates all the internal descriptors that were created when the file was opened.
- Append Operation: This operation adds data to the end of the file.
- Rename Operation: This operation is used to rename the existing file.

8.2.4 File Type

File type refers to the ability of the operating system to distinguish different types of files, such as text files, source files, and binary files. Many operating systems support many types of files. Operating systems like MS-DOS and UNIX have the following types of files:

1. Ordinary Files
- These are the files that contain user information.
- These may hold text, databases, or executable programs.
- The user can apply various operations on such files, like add, modify, delete, or even remove the entire file.

2. Directory Files
- These files contain a list of file names and other information related to these files.

3. Special Files
- These files are also known as device files.
- These files represent physical devices like disks, terminals, printers, networks, tape drives, etc.
These files are of two types:
- Character special files: data is handled character by character, as in the case of terminals or printers.
- Block special files: data is handled in blocks, as in the case of disks and tapes.

A common technique for implementing file types is to include the type as part of the file name. The name is split into two parts, a name and an extension, usually separated by a period, as shown in Table 8.1. In this way, the user and the operating system can tell from the name alone what the type of a file is.

File type        Usual extension            Function
---------------  -------------------------  ---------------------------------------------
executable       exe, com, bin or none      ready-to-run machine-language program
object           obj, o                     compiled, machine language, not linked
source code      c, cc, java, perl, asm     source code in various languages
batch            bat, sh                    commands to the command interpreter
markup           xml, html, tex             textual data, documents
word processor   xml, rtf, docx             various word-processor formats
library          lib, a, so, dll            libraries of routines for programmers
print or view    gif, pdf, jpg              ASCII or binary file in a format for printing
                                            or viewing
archive          rar, zip, tar              related files grouped into one file, sometimes
                                            compressed, for archiving or storage
multimedia       mpeg, mov, mp3, mp4, avi   binary file containing audio or A/V information

Table 8.1: Some of the Common File Types

8.2.5 File Structure

A file has a certain defined structure, which depends on its type. A file structure should follow a required format that the operating system can understand.
- A text file is a sequence of characters organized into lines (and possibly pages).
- A source file is a sequence of procedures and functions, each of which is further organized as declarations followed by executable statements.
- An object file is a sequence of bytes organized into blocks that are understandable by the machine.
- An executable file is a series of code sections that the loader can bring into memory and execute.

When an operating system defines different file structures, it must also contain the code to support these file structures. UNIX and MS-DOS support a minimum number of file structures.

One of the disadvantages of having the operating system support multiple file structures is that the resulting size of the operating system is cumbersome. If the operating system defines five different file structures, it needs to contain the code to support these file structures. In addition, it may be necessary to define every file as one of the file types supported by the operating system. When new applications require information structured in ways not supported by the operating system, severe problems may result.
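The extension-based technique described under File Type can be sketched in a few lines of Python. This is only an illustration: the mapping below is a hand-picked subset of Table 8.1, the function name file_type is hypothetical, and the choice to report "unknown" for a name without a recognized extension is an assumption rather than the behaviour of any particular operating system.

```python
import os

# Illustrative subset of Table 8.1 (extension -> file type).
EXTENSION_TYPES = {
    "exe": "executable", "com": "executable", "bin": "executable",
    "obj": "object", "o": "object",
    "c": "source code", "java": "source code", "asm": "source code",
    "bat": "batch", "sh": "batch",
    "html": "markup", "tex": "markup",
    "rtf": "word processor", "docx": "word processor",
    "zip": "archive", "tar": "archive", "rar": "archive",
    "mp3": "multimedia", "mp4": "multimedia", "avi": "multimedia",
}


def file_type(filename: str) -> str:
    """Guess the file type from the extension alone, as a user or OS might."""
    _, ext = os.path.splitext(filename)      # split "name.ext" into name and ".ext"
    ext = ext.lstrip(".").lower()
    return EXTENSION_TYPES.get(ext, "unknown")
```

For example, file_type("report.docx") yields "word processor", while a name with no extension falls through to "unknown".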
Internal File Structure

Internally, locating an offset within a file can be complicated for the operating system. Disk systems typically have a well-defined block size determined by the size of a sector. All disk I/O is performed in units of one block (physical record), and all blocks are the same size. It is unlikely that the physical record size will exactly match the length of the desired logical record. Logical records may even vary in length. Packing a number of logical records into physical blocks is a common solution to this problem.

For example, the UNIX operating system defines all files to be simply streams of bytes. Each byte is individually addressable by its offset from the beginning (or end) of the file. The file system packs and unpacks bytes into physical disk blocks as necessary.

Advantages of Sequential Access Method

This method of file access provides fast access to the next record using lexicographic order.
hierarchical manner.
To get the benefit of different file systems on different operating systems, a hard disk can be divided into a number of partitions of different sizes. The partitions are also called volumes or minidisks.
Each partition must have at least one directory in which all the files of the partition can be listed. A directory entry is maintained for each file in the directory, and it stores all the information about that file.
[Figure: directory entries (D) pointing to files (F)]
Disadvantages of a single-level directory:
- There may be a chance of name collision because two files cannot have the same name.
- Searching becomes time-consuming if the directory is large.
- Files of the same type cannot be grouped together.
8.4.2 Two-Level Directory
As we have seen, a single-level directory often leads to confusion of file names among different users. The solution to this problem is to create a separate directory for each user.
In the two-level directory structure, each user has their own user file directory (UFD). The UFDs have similar structures, but each lists only the files of a single user. The system's master file directory (MFD) is searched whenever a new user logs in. The MFD is indexed by user name or account number, and each entry points to the UFD for that user.
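The MFD and UFD lookup described above can be sketched with nested dictionaries. This is a minimal illustration, not an actual file-system implementation; the class name TwoLevelDirectory, its methods, and the user and file names are all hypothetical.

```python
class TwoLevelDirectory:
    """Master file directory (MFD) that maps each user to a user file directory (UFD)."""

    def __init__(self):
        self.mfd = {}                      # user name -> UFD (file name -> file info)

    def add_user(self, user):
        self.mfd.setdefault(user, {})      # each user gets a separate, empty UFD

    def create(self, user, name, info):
        ufd = self.mfd[user]               # MFD is indexed by user name
        if name in ufd:                    # names must be unique only within one UFD
            raise FileExistsError(name)
        ufd[name] = info

    def lookup(self, user, name):
        return self.mfd[user][name]        # search only that user's UFD


d = TwoLevelDirectory()
d.add_user("user1")
d.add_user("user2")
d.create("user1", "notes.txt", "user1's notes")
d.create("user2", "notes.txt", "user2's notes")   # same name, different user: no collision
```

Because each lookup is confined to one UFD, two users can hold files with the same name, which is the problem the two-level structure is meant to solve.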
[Figure: two-level directory, with entries for user1 to user4 each pointing to a user directory and its files]
8.4.4 Acyclic Graph Directory

An acyclic graph is a graph with no cycles, and it allows subdirectories and files to be shared. The same file or subdirectory may be in two different directories. It is a natural generalization of the tree-structured directory.

It is used in situations such as when two programmers are working on a joint project and both need to access its files. The associated files are stored in a subdirectory, separating them from other projects and from the files of other programmers. Since they are working on a joint project, both programmers want the subdirectory to appear in their own directories. The common subdirectory should be shared, so here we use acyclic graph directories.
[Figure: acyclic graph directory, showing the root, directories, and files]
Advantages of Acyclic Graph Directory
- We can share files.
- Searching is easy because a file can be reached through different paths.

Disadvantages of Acyclic Graph Directory
- Files are shared via linking, so deleting a file may cause problems.
- If the link is a soft link, deleting the file leaves a dangling pointer.
- In the case of a hard link, to delete a file we have to delete all the references associated with it.

8.4.5 General Graph Directory Structure

In a general graph directory structure, cycles are allowed within the directory structure, and a directory can be derived from more than one parent directory. The main problem with this kind of directory structure is calculating the total size or space that has been taken by the files and directories.

[Figure 8.6: General Graph Directory Structure]

Advantages of General Graph Directory Structure
- It allows cycles.
- It is more flexible than other directory structures.

8.5 File Allocation Methods

There are different kinds of methods used to allocate disk space to files. We must select the best method for file allocation because it directly affects system performance and efficiency. With the help of the allocation method, we can utilize the disk effectively and the files can be accessed.

There are different types of file allocation methods, including contiguous allocation, extents, linked list allocation, indexed allocation, linked indexed allocation, multilevel indexed allocation, and inodes. We mainly use three of them:
1. Contiguous allocation
2. Linked list allocation
3. Indexed allocation

8.5.1 Contiguous Allocation

Contiguous allocation is one of the most used methods for allocation. Contiguous allocation means that we allocate blocks to a file in such a manner that all of its blocks are physically contiguous on the hard disk.

In the table below, the directory holds three files. For each file, the table gives the starting block and the length, and each file is allocated a contiguous run of blocks.

File Name    Start   Length   Allocated Blocks
---------    -----   ------   ----------------
abc.text     0       3        0, 1, 2
video.mp4    4       2        4, 5
jtp.docx     9       3        9, 10, 11
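Because a contiguously allocated file is described completely by its starting block and length, its block list and any direct-access lookup reduce to simple arithmetic. The sketch below assumes the three directory entries from the table above; the function names allocated_blocks and physical_block are hypothetical.

```python
def allocated_blocks(start: int, length: int) -> list:
    """All disk blocks held by a file that starts at `start` and spans `length` blocks."""
    return list(range(start, start + length))


def physical_block(start: int, length: int, i: int) -> int:
    """Disk block holding the i-th (0-based) logical block of the file (direct access)."""
    if not 0 <= i < length:
        raise IndexError("logical block outside the file")
    return start + i


# Directory entries taken from the table: file name -> (start, length).
directory = {"abc.text": (0, 3), "video.mp4": (4, 2), "jtp.docx": (9, 3)}
layout = {name: allocated_blocks(s, n) for name, (s, n) in directory.items()}
```

The same arithmetic is why contiguous allocation supports both sequential and direct access: no pointers have to be followed to find a given block.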
In the diagram below, there is a file named 'mail'. The file starts from the 19th block and the length of the file is 6, so the file occupies 6 blocks. Thus, it will hold blocks 19, 20, 21, 22, 23, and 24.

[Figure 8.8: Example of Contiguous Allocation]

Advantages of Contiguous Allocation

The advantages of contiguous allocation are:
1. The contiguous allocation method gives excellent read performance.
2. Contiguous allocation is easy to implement.
3. The contiguous allocation method supports both types of file access methods, that is, sequential access and direct access.
4. The contiguous allocation method is fast because the number of seeks is small, owing to the contiguous allocation of file blocks.

Disadvantages of Contiguous Allocation

The disadvantages of the contiguous allocation method are:
1. In the contiguous allocation method, the disk can sometimes become fragmented.
2. In this method, it is difficult to increase the size of a file, because a sufficiently large contiguous block may not be available.

Linked List Allocation

The linked list allocation method overcomes the drawbacks of the contiguous allocation method. In this file allocation method, each file is treated as a linked list of disk blocks. It is not required that the disk blocks assigned to a specific file be contiguous on the disk. The directory entry comprises a pointer to the starting file block and also a pointer to the ending file block. Each disk block that is allocated or assigned to a file contains a pointer, and that pointer points to the next disk block allocated to the same file.

Example of Linked List Allocation

In the figure below, we have a file named 'jeep'. The value of start is 9, so the allocation starts from the 9th block, and the remaining blocks are allocated in a random manner. The value of end is 25, which means the allocation finishes on the 25th block. In the figure, block 25 holds -1, which means a null pointer: it does not point to another block.

[Figure 8.9: Example of Linked List Allocation]

Advantages of Linked List Allocation

There are various advantages of linked list allocation:
1. In linked list allocation, there is no external fragmentation. Due to this, we can utilize the memory better.
2. In linked list allocation, a directory entry needs to hold only the starting block address.
3. The linked allocation method is flexible because we can quickly increase the size of the file: to allocate a file, we do not require a chunk of memory in a contiguous form.
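Reading a file under linked list allocation means following the per-block pointers from the start block until the null pointer (-1) is reached. The sketch below uses the 'jeep' file with start 9 and end 25 from the example; the intermediate blocks in NEXT_POINTER are assumed for illustration, since only the start and end values are stated in the text, and the function name file_blocks is hypothetical.

```python
def file_blocks(next_pointer: dict, start: int) -> list:
    """Follow the chain of block pointers from `start` until the null pointer (-1)."""
    blocks = []
    seen = set()
    current = start
    while current != -1:
        if current in seen:                 # guard against a corrupted, cyclic chain
            raise ValueError("cycle in block chain")
        seen.add(current)
        blocks.append(current)
        current = next_pointer[current]     # pointer stored inside the current block
    return blocks


# Assumed chain for the file 'jeep': block -> next block, with -1 as the null pointer.
NEXT_POINTER = {9: 16, 16: 1, 1: 10, 10: 25, 25: -1}
chain = file_blocks(NEXT_POINTER, start=9)
```

Note that reaching the last block requires visiting every block before it, which is why this scheme suits sequential access better than direct access.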
Indexed Allocation

In indexed allocation, for each file there is an individual index block. In the index block, the ith entry holds the disk address of the ith file block. In the figure below, the directory entry comprises the address of the index block.

[Figure 8.10: Index Allocation]

Advantages of Index Allocation

The advantages of index allocation are:
1. The index allocation method solves the problem of external fragmentation.
2. Index allocation provides direct access.

The index block can be organized using the following schemes:
1. Linked Scheme: In the linked scheme, two or more index blocks are linked together to hold the pointers. Each index block contains the address of the next index block or a null pointer.
2. Multilevel Index: In the multilevel index, a first-level index block points to second-level index blocks, which in turn point to the disk blocks occupied by the file. We can extend this to 3 or more levels depending on the maximum size of the file.
3. Combined Scheme: In a combined scheme, there is a special block which is called an information node (inode). The inode comprises all the information related to the file, like authority, name, size, etc. The remaining space of the inode is used to store the addresses of the disk blocks that contain the actual file. In the inode, the first pointers are used to point to the direct blocks; that is, these pointers hold the addresses of the disk blocks that contain the file data. The next few pointers are used to indicate the indirect blocks. The indirect blocks are of three types: single indirect, double indirect, and triple indirect.

Inode

In the UNIX operating system, every file is indexed with the help of an inode. An inode is a block that is created when the file system is created.
There are various types of information included in an inode:
1. Attributes of the file, such as time stamps, permissions, details, ownership, etc.
2. A number of direct block pointers, which hold the addresses of the first blocks of the file.
3. A single indirect pointer. It points to an index block. If the entire file cannot be indexed using the direct blocks, we use the single indirect pointer.
4. A double indirect pointer. This pointer points to a disk block that holds the addresses of further index blocks.
5. A triple indirect pointer. This pointer also points to a disk block, adding one more level of index blocks.
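To make the combined scheme concrete, the following sketch works out which inode pointer level covers a given logical block of a file. The figures used (12 direct pointers and 1024 block addresses per index block) are assumptions chosen for illustration and are not stated in the text; the function name pointer_level is hypothetical.

```python
def pointer_level(logical_block: int, n_direct: int = 12, ptrs_per_block: int = 1024) -> str:
    """Return which inode pointer level reaches the given 0-based logical block."""
    if logical_block < 0:
        raise ValueError("negative block number")
    if logical_block < n_direct:
        return "direct"
    logical_block -= n_direct                 # skip the blocks covered by direct pointers
    if logical_block < ptrs_per_block:
        return "single indirect"
    logical_block -= ptrs_per_block           # skip the single-indirect range
    if logical_block < ptrs_per_block ** 2:
        return "double indirect"
    logical_block -= ptrs_per_block ** 2      # skip the double-indirect range
    if logical_block < ptrs_per_block ** 3:
        return "triple indirect"
    raise ValueError("block beyond the maximum file size")


# Largest file, in blocks, that this assumed inode layout can address.
MAX_BLOCKS = 12 + 1024 + 1024 ** 2 + 1024 ** 3
```

Small files are reached through the direct pointers alone, and each further level of indirection multiplies the reachable range by the number of addresses that fit in one index block.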
8.6 Free Space Management

A file system is responsible for allocating free blocks to files; therefore, it has to keep track of all the free blocks present on the disk. There is system software in an operating system that keeps track of free space in order to allocate and de-allocate blocks for files, and this is called the "file management system". The operating system keeps a "free space list" that maintains the record of free blocks. When a file is created, the operating system searches the free space list for the space to be allocated to save the file. On deletion of a file, the file system frees its space and adds it to the "free space list".

There are some methods or techniques to implement a free space list:
- Bitmap
- Linked list
- Grouping
- Counting

8.6.1 Bitmap or Bit Vector

A bit vector is the most frequently used method to implement the free space list. A bit vector is also known as a "bit map". It is a series or collection of bits in which each bit represents a disk block. The values taken by the bits are either 1 or 0. If the block bit is 1, it means the block is free. If the block bit is 0, it means the block is not free; it is allocated to some file. Initially all the blocks are free, so each bit in the bit vector is 1.

Example

Given below is a diagrammatic representation of a disk in which there are some free and some occupied blocks. The upper part shows the block numbers. Free blocks are represented by 1 and occupied blocks are represented by 0.

[Figure: bitmap of disk blocks, with each bit marking its block as allocated or free]

A free block is a block that does not hold any data. The formula to find a free block number is:

Block number = (number of bits per word) * (number of 0-value words) + offset

Consider the bit sequence 00111100011110 and take its first 8 bits as a word. This is a non-zero word, since not all of its bits are 0. A "non-zero word" is a word that contains the bit value 1. Here, the first 1 bit is the third bit of the word, so the offset is 3, and no 0-value word precedes it.
Hence, the block number = 8 * 0 + 3 = 3.

Advantages
- This technique is relatively simple.
- This technique is very efficient for finding free space on the disk.

Disadvantages
- This technique requires special hardware support to find the first 1 in a word that is not 0.
- This technique is not useful for larger disks.

Linked List

This is another technique for free space management. In this technique, a linked list of all the free blocks is maintained. There is a head pointer, kept in a special location on the disk, which points to the first free block of the list. This block contains the pointer to the next free block, the next block contains the pointer to the one after it, and so on. Searching the free list is not easy with this technique. Traversing the list is inefficient because each disk block must be read, which requires I/O time. However, traversing the free list is not a frequent action.

[Figure: linked free-space list, with a head pointer to the first free block and a null pointer at the end]

Advantages
- Whenever a file is to be allocated a free block, the operating system can simply allocate the first block in the free space list and move the head pointer to the next free block in the list.

Disadvantages
- Searching the free space list is very time-consuming; each block has to be read from the disk, which is very slow compared to main memory.
- It is not efficient for fast access.

Grouping

This is also a technique of free space management. It is a modification of the free-list approach in which the first free block stores the addresses of n free blocks. The first n-1 of these blocks are actually free, but the last block contains the addresses of another n free blocks. Unlike the standard linked list approach, the addresses of a large number of free blocks can be found very quickly.

In this approach, we do not keep a list of n free disk addresses, but we keep the address of the first free block
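The bit-vector search and block-number formula from the Bitmap section can be sketched as follows. This is a minimal illustration that keeps the bitmap as a string of '0' and '1' characters, with 1 meaning free and the offset counted from 1 as in the worked example; the function name first_free_block and the 8-bit word size are assumptions.

```python
def first_free_block(bitmap: str, bits_per_word: int = 8) -> int:
    """Return the number of the first free block (bit value 1), or -1 if none is free."""
    for start in range(0, len(bitmap), bits_per_word):
        word = bitmap[start:start + bits_per_word]
        if "1" in word:                                  # first non-zero word
            zero_words = start // bits_per_word          # 0-value words skipped so far
            offset = word.index("1") + 1                 # position of the first 1, from 1
            return bits_per_word * zero_words + offset
    return -1                                            # every block is allocated


result = first_free_block("00111100011110")
```

Scanning a word at a time is what makes the bitmap efficient: whole words of allocated blocks are skipped with a single comparison before the offset inside the first non-zero word is computed.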
8.10 Protection

When information is stored in a computer system, we want to keep it safe from physical damage (the issue of reliability) and improper access (the issue of protection). Reliability is generally provided by duplicate copies of files. Many computers have systems programs that automatically (or through computer-operator intervention) copy disk files to tape at regular intervals (once per day or week or month) to maintain a copy should a file system be accidentally destroyed. File systems can be damaged by hardware problems (such as errors in reading or writing), power surges or failures, head crashes, dirt, temperature extremes, and vandalism. Files may be deleted accidentally. Bugs in the file-system software can also cause file contents to be lost.

Protection can be provided in many ways. For a single-user laptop system, we might provide protection by locking the computer in a desk drawer or file cabinet. In a larger multiuser system, however, other mechanisms are needed.
8.10.1 Types of Access

The need to protect files is a direct result of the ability to access files. Systems that do not permit access to the files of other users do not need protection. Thus, we could provide complete protection by prohibiting access. Alternatively, we could provide free access with no protection. Both approaches are too extreme for general use. What is needed is controlled access.

Protection mechanisms provide controlled access by limiting the types of file access that can be made. Access is permitted or denied depending on several factors, one of which is the type of access requested. Several different types of operations may be controlled:
- Read: Read from the file.
- Write: Write or rewrite the file.
- Execute: Load the file into memory and execute it.
- Append: Write new information at the end of the file.
- Delete: Delete the file and free its space for possible reuse.
- List: List the name and attributes of the file.
Other operations, such as renaming, copying, and editing the file, may also be controlled. For many systems, however, these higher-level functions may be implemented by a system program that makes lower-level system calls. Protection is provided at only the lower level. For instance, copying a file may be implemented simply by a sequence of read requests. In this case, a user with read access can also cause the file to be copied, printed, and so on.

Many protection mechanisms have been proposed. Each has advantages and disadvantages and must be appropriate for its intended application. A small computer system that is used by only a few members of a research group, for example, may not need the same types of protection as a large corporate computer that is used for research, finance, and personnel operations.
8.10.2 Access Control

The most common approach to the protection problem is to make access dependent on the identity of the user. Different users may need different types of access to a file or directory. The most general scheme to implement identity-dependent access is to associate with each file and directory an access-control list (ACL) specifying user names and the types of access allowed for each user. When a user requests access to a particular file, the operating system checks the access list associated with that file. If that user is listed for the requested access, the access is allowed. Otherwise, a protection violation occurs, and the user job is denied access to the file.

This approach has the advantage of enabling complex access methodologies. The main problem with access lists is their length. If we want to allow everyone to read a file, we must list all users with read access. This technique has two undesirable consequences:
- Constructing such a list may be a tedious and unrewarding task, especially if we do not know in advance the list of users in the system.
- The directory entry, previously of fixed size, now must be of variable size, resulting in more complicated space management.

These problems can be resolved by use of a condensed version of the access list. To condense the length of the access-control list, many systems recognize three classifications of users in connection with each file:
- Owner: The user who created the file is the owner.
- Group: A set of users who are sharing the file and need similar access.
- Universe: All other users in the system.
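The access check described in this section can be sketched as below. This is an illustrative model only: the class ProtectedFile, the function is_allowed, the user names, and the default permission sets are all hypothetical, and "universe" is used here to mean every user who is neither the owner nor a group member. A full ACL entry, when present, is consulted first; otherwise the condensed classification applies.

```python
from dataclasses import dataclass, field


@dataclass
class ProtectedFile:
    owner: str
    group: set = field(default_factory=set)
    # Condensed access list: permitted access types per classification (assumed defaults).
    owner_access: set = field(default_factory=lambda: {"read", "write", "execute"})
    group_access: set = field(default_factory=lambda: {"read"})
    universe_access: set = field(default_factory=set)
    # Optional full access-control list: user name -> permitted access types.
    acl: dict = field(default_factory=dict)


def is_allowed(f: ProtectedFile, user: str, access: str) -> bool:
    """Return True if `user` may perform `access`; False models a protection violation."""
    if user in f.acl:                       # explicit per-user ACL entry
        return access in f.acl[user]
    if user == f.owner:
        return access in f.owner_access
    if user in f.group:
        return access in f.group_access
    return access in f.universe_access      # everyone else


report = ProtectedFile(owner="alice", group={"bob"}, acl={"carol": {"read", "append"}})
```

The condensed form needs only three fixed-size fields per file, which avoids the variable-length directory entry that a full access list would require.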
Consistency Checking

The consistency checker, a systems program, compares the data in the directory structure with the data blocks on disk and tries to fix any inconsistencies it finds. The allocation and free-space-management algorithms dictate what types of problems the checker can find and how successful it will be in fixing them. For instance, if linked allocation is used and there is a link from any block to its next block, then the entire file can be reconstructed from the data blocks, and the directory structure can be recreated.
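One kind of comparison a consistency checker might make is a block-accounting pass: every block should be either listed in exactly one file or present in the free-space list. The sketch below is an assumed, simplified model of that idea and not the algorithm of any particular checker; the function name check_consistency, the report keys, and the sample data are hypothetical.

```python
def check_consistency(total_blocks: int, directory: dict, free_blocks: set) -> dict:
    """Compare directory allocations with the free-space list and report mismatches."""
    use_count = [0] * total_blocks
    for blocks in directory.values():          # count how many files claim each block
        for b in blocks:
            use_count[b] += 1
    return {
        # Neither in any file nor in the free list: the block has been lost.
        "missing": [b for b in range(total_blocks)
                    if use_count[b] == 0 and b not in free_blocks],
        # Claimed by a file and also marked free: it could be handed out twice.
        "in_use_and_free": [b for b in range(total_blocks)
                            if use_count[b] > 0 and b in free_blocks],
        # Claimed by more than one file.
        "shared": [b for b in range(total_blocks) if use_count[b] > 1],
    }


report = check_consistency(
    total_blocks=6,
    directory={"a": [0, 1], "b": [1, 2]},
    free_blocks={2, 3},
)
```

A checker could repair the first case by returning missing blocks to the free list and the second by removing in-use blocks from it; how far repairs can go depends on the allocation method, as the paragraph above notes.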