0% found this document useful (0 votes)

5 views23 pages

File Management Module-5

Uploaded by

PALLAVI Y

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

5 views23 pages

File Management Module-5

Uploaded by

PALLAVI Y

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 23

Implementation of File System

o File Attributes
o File Operations
o File Types
o File Structure
o Internal File Structure

Access Methods

o Sequential Access
o Direct Access
o Other Access Methods – Indexed Access

Directory and Disk Structure

o Storage Structure
o Directory Overview
o Single-Level Directory
o Two-Level Directory
o Tree –Structured Directories
o Acyclic –Graph Directories
o General Graph Directory

File System Mounting

File Sharing
o Multiple Users
o Remote File Systems
 The Client-Server Model
 Distributed Information Systems
 Failure Modes
Consistency Semantics
o Unix Semantics
o Session Semantics
o Immutable – Shared Files Semantics
Protection – Types, Access Control, other approaches
File-System Structure
File-System Implementation
Directory Implementation
Allocation Methods
Free-Space Management
File Concepts
The file system consists of two distinct parts: a collection of files, each storing related data, and a
directory structure, which organizes and provides information about all the files in the system.
A file is a collection of related information that is recorded on secondary storage. A file is
a sequence of bits, bytes, lines, or records, the meaning of which is defined by the file's creator
and user.

a) File Attributes
A file is named, for the convenience of its human users, and is referred to by its name. A file's
attributes vary from one operating system to another .The attributes of a file are p-
o Name - The symbolic file name is the only information kept in humanreadable form.
Some special significance is given to names, and particularly extensions ( .exe, .txt, etc. ).
o Identifier – It is a unique number, that identifies the file within the file system.
o Type – Type of the file like text, executable, other binary, etc.
o Location - . location of the file on that device.
o Size - The current size of the file (in bytes, words, or blocks)
o Protection - Access-control information ( reading, writing, executing).
o Time, date, and user identification –These data can be useful for protection, security,
and usage monitoring.

b) File Operations
The operating system provides system calls to create, write, read, reposition, delete,
and truncate files.
o Creating a file - Two steps are necessary to create a file
 Find space in the file system for the file.
 Make an entry for the new file in the directory.
o Writing a file - To write a file, the system call consists of both the name of the
file and the information to be written to the file. Given the name of the file,
the system searches the directory to find the file's location. The system
must keep a write pointer to the location in the file where the next write is
to take place. The write pointer must be updated whenever a write occurs.
o Reading a file - To read from a file, the system call that specifies the name of the
file and where the next block of the file should be put. The directory is
searched for the file, and the system needs to keep a read pointer to the
location in the file where the next read is to take place. Once the read has
taken place, the read pointer is updated.
o Repositioning within a file - The directory is searched for the file, and the file
pointer is repositioned to a given value. This file operation is also known
as a file seek.
o Deleting a file – To delete a file, search the directory for the file. Release all file
space, so that it can be reused by other files, and erase the directory entry.
o Truncating a file - The user may want to erase the contents of a file but keep its
attributes. Rather than forcing the user to delete the file and then recreate
it, this function allows all attributes to remain unchanged –except for file
length. The file size is reset to zero.
Open file table
Information about currently open files is stored in an open file table. It contains
information like:
o File pointer - records the current position in the file, for the next read or write
access.
o File-open count - How many times has the current file been opened by different
processes, at the same time and not yet closed? When this counter reaches zero
the file can be removed from the table.
o Disk location of the file – The information needed to locate the file on disk is
kept in memory so that the system does not have to read it from disk for each
opearation.
o Access rights – The file access permissions are stored on the per-process table so
that the operating system can allow or deny subsequent I/O requests.
Some systems provide support for file locking.
o A shared lock is for reading only.
o A exclusive lock is for writing as well as reading.
o An advisory lock , it is up to the software developers to ensure that locks are
acquired or realeased.
o A mandatory lock , prevents any other process from accessing the locked file.. ( A
truly locked door. )
o UNIX uses advisory locks, and Windows uses mandatory locks.

c) File Types
 File name consists of two parts: name and extension
 The user and the operating system can identify the type of a file using the name.
 Most operating systems allow users to specify a file name as a sequence of
characters followed by a period and terminated by an extension. Example :
resume.doc, threads.c etc.
 The system uses the extension to indicate the type of the file and the type of
operations that can be done on that file.
 or instance, only a file with a ".corn", ".exe", or ".bat”, can be executed.

d) File Structure
The study of different ways of storing files in secondary memory such that they can be
easily accessed.
File types can be used to indicate the internal structure of the file. Certain files must be in
a particular structure that is understood by the operating system.
For example, the operating system requires that an executable file have a specific
structure so that it can determine where in memory to load the file and the location of the
first instruction.
UNIX treats all files as sequences of bytes, with no further consideration of the internal
structure.
Macintosh files have two forks - a resource fork, and a data fork. The resource fork
contains information relating to the UI, such as icons and button images. The data fork
contains the traditional file contents-program code or data.

e) Internal File Structure

 Disk systems typically have a well-defined block size determined by the size of a
sector. A group of sectors form a group
 All disk I/0 is performed in units of one block, and all blocks are the same size.
 Logical records may even vary in length. Padding a number of logical records into
physical blocks is a common solution to this problem.
 The packing can be done either by the user's application program or by the
operating system. In either case, the file may be considered a sequence of blocks.
 All the basic I/O functions operate in terms of blocks.
 Disk space is always allocated in terms of blocks. Some portion of last block(while
storing a file) is always wasted. This is called internal fragmentation.
Access Methods

The file information is accessed and read into computer memory. The information in the file can
be accessed in several ways.

a) Sequential Access
Here information in the file is processed in order, one record after the other.
This mode of access is a common method; for example, editors and compilers usually
access files in this fashion.
A sequential access file emulates magnetic tape operation, and generally supports a
few operations:
o read next - read a record and advance the file pointer to the next position.
o write next - write a record to the end of file and advance the file pointer to the
next position.
o skip n records - May or may not be supported. ‘n’ may be limited to positive
numbers, or may be limited to +/- 1.

a) Direct Access
A file is made up of fixed-length logical records that allow programs to read and write
records randomly. The records can be rapidly accessed in any order.

Direct access are of great use for immediate access to large amount of information.
Eg : Database file. When a query occurs, the query is computed and only the selected rows are
access directly to provide the desired information.
Operations supported include:
read n - read record number n. (position the cursor to n and then read the record)
write n - write record number n. (position the cursor to n and then write the
record)
jump to record n – move to nth record (n- could be 0 or the end of file)
If the record length is L, there is a request for record ‘N’. Then the direct access to
the starting byte of record ‘N’ is at L*(N-1)

Eg: if 3rd record is required and length of each record(L) is 50, then the starting
position of 3rd record is L*(N-1)
Address = 50*(3-1) = 100.

b) Other Access Methods(Indexed method)

These methods generally involve the construction of an index for the file called index
file.
The index file is like an index page of a book, which contains key and address. To find a
record in the file, we first search the index and then use the pointer to access the record
directly and find the desired record.
An indexed access scheme can be easily built on top of a direct access system.
For very large files, the index file itself is very large. The solution to this is to create an
index for index file. i.e. multi-level indexing.
Directory Structure

Directory is a structure which contains filenames and information about the files like
location, size, type etc. The files are put in different directories. Partitioning is useful for limiting
the sizes of individual file systems, putting multiple file-system types on the same device, or
leaving part of the device available for other uses.
Partitions are also known as slices or minidisks. A file system can be created on each of
these parts of the disk. Any entity containing a file system is generally known as a volume.

Directory Overview

The directory can be viewed as a symbol table that translates file names into their directory
entries.
Directory operations to be supported include:
o Search for a file - search a directory structure to find the entry for aparticular file.
o Create a file – create new files and add to the directory
o Delete a file - When a file is no longer needed, erase it from the directory
o List a directory - list the files in a directory and the contents of thedirectory entry.
o Rename a file – Change the name of the file. Renaming a file may also allow its
position within the directory structure to be changed.
o Traverse the file system - Access every directory and every file within adirectory
structure.
Directory Structures -
a) Single-Level Directory
It is the simplest directory structure.
All files are contained in the same directory, which is easy to support and understand.

The limitations of this structure is that -

All files are in the same directory must have unique names.
Even a single user on a single-level directory may find it difficult to remember the names
of all the files as the number of files increases.

a) Two-Level Directory
Each user gets their own directory space - user file directory(UFD)
File names only need to be unique within a given user's directory.
A master file directory(MFD) is used to keep track of each users directory, and
must be maintained when users are added to or removed from the system.
When a user refers to a particular file, only his own UFD is searched.
All the files within each UFD are unique.
To create a file for a user, the operating system searches only that user's UFD to
ascertain whether another file of that name exists.
To delete a file, the operating system confines its search to the local UFD; thus, it
cannot accidentally delete another user's file that has the same name. The user
directories themselves must be created and deleted as necessary.
This structure isolates one user from another. Isolation is an advantage when the
users are completely independent but is a disadvantage when the users want to
cooperate on some task and to access one another's files.
b) Tree-Structured Directories
A tree structure is the most common directory structure.
The tree has a root directory, and every file in the system has a unique path name.
A directory (or subdirectory) contains a set of files or subdirectories.
One bit in each directory entry defines the entry as a file (0) or as a subdirectory (1).
Special system calls are used to create and delete directories.
Path names can be of two types: absolute and relative. An absolute path begins at the
root and follows a down to the specified file, giving the directory names on the path.
A relative path defines a path from the current directory.
For example, in the tree-structured file system of figure below if the current directory
is root/spell/mail, then the relative path name is prt/first and the files absolute path
name root/spell/mail/prt/jirst.
Directories are stored the same as any other file in the system, except there is a bit
that identifies them as directories, and they have some special structure that the OS
understands.
One question for consideration is whether or not to allow the removal of directories
that are not empty - Windows requires that directories be emptied first, and UNIX
provides an option for deleting entire sub-trees.

a) Acyclic-Graph Directories
When the same files need to be accessed in more than one place in the directory
structure ( e.g. because they are being shared by more than one user), it can be useful
to provide an acyclic-graph structure. ( Note the directed arcs from parent to child. )
o UNIX provides two types of links (pointer to another file)for implementing the
acyclic-graph structure.
o A hard link ( usually just called a link ) involves multiple directory entries that
both refer to the same file. Hard links are only valid for ordinary files in the same
filesystem.
o A symbolic link, that involves a special file, containing information about where
to find the linked file. Symbolic links may be used to link directories and/or files
in other file systems, as well as ordinary files in the current file system.
Windows only supports symbolic links, termed shortcuts.
Hard links require a reference count, or link count for each file, keeping track of
how many directory entries are currently referring to this file. Whenever one of the
references is removed the link count is reduced, and when it reaches zero, the disk
space can be reclaimed.
For symbolic links there is some question as to what to do with the symbolic links
when the original file is moved or deleted:
o One option is to find all the symbolic links and adjust them also
o Another is to leave the symbolic links dangling, and discover that they are no
longer valid the next time they are used.
o What if the original file is removed, and replaced with another file having the
same name before the symbolic link is next used?

Another approach to deletion is to preserve the file until all references to it are deleted. To
implement this approach, we must have some mechanism for determining that the last
reference to the file has been deleted.
When a link or a copy of the directory entry is established, a new entry is added to the file-
reference list. When a link or directory entry is deleted, we remove its entry on the list. The
file is deleted when its file-reference list is empty.
General Graph Directory
If cycles are allowed in the graphs, then several problems can arise:
o Search algorithms can go into infinite loops. One solution is to not follow links in
search algorithms. (Or not to follow symbolic links, and to only allow symbolic links
to refer to directories)
o Sub-trees can become disconnected from the rest of the tree and still not have
their reference counts reduced to zero. Periodic garbage collection is required to
detect and resolve this problem. (chkdsk in DOS and fsck in UNIX search for these
problems, among others, even though cycles are not supposed to be allowed in either
system. Disconnected disk blocks that are not marked as free are added back to the
file systems with made-up file names, and can usually be safely deleted. )

File-System Mounting

 The basic idea behind mounting file systems is to combine multiple file systems into one
large tree structure.
 The mount command is given a filesystem to mount and a mount point (directory) on
which to attach it.
 Once a file system is mounted onto a mount point, any further references to that directory
actually refer to the root of the mounted file system.
 Any files ( or sub-directories ) that had been stored in the mount point directory prior to
mounting the new filesystem are now hidden by the mounted filesystem, and are no
longer available. For this reason some systems only allow mounting onto empty
directories.
 Filesystems can only be mounted by root, unless root has previously configured certain
filesystems to be mountable onto certain pre-determined mount points. (E.g. root may
allow users to mount floppy filesystems to /mnt or something like it) Anyone can run the
mount command to see what filesystems are currently mounted.
 Filesystems may be mounted read-only, or have other restrictions imposed.
The traditional Windows OS runs an extended two-tier directory structure, where the first tier of
the structure separates volumes by drive letters, and a tree structure is implemented below that
level.

 Macintosh runs a similar system, where each new volume that is found is automatically
mounted and added to the desktop when it is found.
 More recent Windows systems allow filesystems to be mounted to any directory in the
filesystem, much like UNIX.

File Sharing

Multiple Users

 On a multi-user system, more information needs to be stored for each file:

o The owner ( user ) who owns the file, and who can control its access.
o The group of other user IDs that may have some special access to the file.
o What access rights are afforded to the owner ( User ), the Group, and to the rest of
the world ( the universe, a.k.a. Others. )
o Some systems have more complicated access control, allowing or denying
specific accesses to specifically named users or groups.

Remote File Systems

 The advent of the Internet introduces issues for accessing files stored on remote
computers
o The original method was ftp, allowing individual files to be transported across
systems as needed. Ftp can be either account and password controlled,
or anonymous, not requiring any user name or password.
o Various forms of distributed file systems allow remote file systems to be mounted
onto a local directory structure, and accessed using normal file access commands.
( The actual files are still transported across the network as needed, possibly using ftp
as the underlying transport mechanism. )
o The WWW has made it easy once again to access files on remote systems without
mounting their filesystems, generally using ( anonymous ) ftp as the underlying
file transport mechanism.

a) The Client-Server Model

 When one computer system remotely mounts a filesystem that is physically

located on another system, the system which physically owns the files acts as
a server, and the system which mounts them is the client.
 User IDs and group IDs must be consistent across both systems for the system to
work properly. ( I.e. this is most applicable across multiple computers managed
by the same organization, shared by a common group of users. )
 The same computer can be both a client and a server. ( E.g. cross-linked file
systems. )
 There are a number of security concerns involved in this model:
o Servers commonly restrict mount permission to certain trusted systems only.
Spoofing ( a computer pretending to be a different computer ) is a
potential security risk.
o Servers may restrict remote access to read-only.
o Servers restrict which filesystems may be remotely mounted. Generally the
information within those subsystems is limited, relatively public, and
protected by frequent backups.
 The NFS ( Network File System ) is a classic example of such a system.

b) Distributed Information Systems

 The Domain Name System, DNS, provides for a unique naming system across
all of the Internet.
 Domain names are maintained by the Network Information System, NIS, which
unfortunately has several security issues. NIS+ is a more secure version, but has
not yet gained the same widespread acceptance as NIS.
 Microsoft's Common Internet File System, CIFS, establishes a network
login for each user on a networked system with shared file access. Older
Windows systems used domains, and newer systems ( XP, 2000 ), use active
directories. User names must match across the network for this system to be
valid.
 A newer approach is the Lightweight Directory-Access Protocol, LDAP, which
provides a secure single sign-on for all users to access all resources on a
network. This is a secure system which is gaining in popularity, and which has
the maintenance advantage of combining authorization information in one
central location.

c) Failure Modes

 When a local disk file is unavailable, the result is generally known immediately,
and is generally non-recoverable. The only reasonable response is for the response
to fail.
 However when a remote file is unavailable, there are many possible reasons, and
whether or not it is unrecoverable is not readily apparent. Hence most remote
access systems allow for blocking or delayed response, in the hopes that the
remote system ( or the network ) will come back up eventually.

Consistency Semantics

 Consistency Semantics deals with the consistency between the views of shared
files on a networked system. When one user changes the file, when do other users
see the changes?
 The series of accesses between the open() and close() operations of a file is called
the file session.

Examples of consistency semantics -

File-System Structure

Disks provide the bulk of secondary storage on which a file system is maintained. The two
characteristics that make them a convenient medium for storing multiple files:
1. A disk can be rewritten in place; it is possible to read a block from the disk, modify the
block, and write it back into the same place.
2. A disk can access directly any given block of information it contains. Thus, it is simple to
access any file either sequentially or randomly, and switching from one file to another
requires only moving the read-write heads and waiting for the disk to rotate.

 To improve I/O efficiency, I/O transfers between memory and disk are performed in units
of blocks. Block sizes may range from 512 bytes to 4K or larger.( Rather than
transferring a byte at a time,)
 A file system poses two quite different design problems.
 The first problem is defining how the file system should look to the user. This
task involves defining a file and its attributes, the operations allowed on a file, and
the directory structure for organizing files.
 The second problem is creating algorithms and data structures to map the logical
file system onto the physical secondary-storage devices.

 File systems organize storage on disk drives, and can be viewed as a layered design:
o At the lowest layer are the physical devices, consisting of the magnetic media,
motors & controls, and the electronics connected to them and controlling them.
Modern disk put more and more of the electronic controls directly on the disk drive
itself, leaving relatively little work for the disk controller card to perform.
o Lowest level, I/O Control consists of device drivers, which communicate with the
devices by reading and writing special codes directly to and from memory addresses
corresponding to the controller card's registers. Each controller card ( device ) on a
system has a different set of addresses ( registers, ports ) that it listens to, and a
unique set of command codes and results codes that it understands.
o The basic file system level works directly with the device drivers in terms of
retrieving and storing raw blocks of data, without any consideration for what is in
each block.
o The file organization module knows about files and their logical blocks, and how
they map to physical blocks on the disk. In addition to translating from logical to
physical blocks, the file organization module also maintains the list of free blocks,
and allocates free blocks to files as needed.
o The logical file system deals with all of the meta data associated with a file (UID,
GID, mode, dates, etc), i.e. everything about the file except the data itself. This level
manages the directory structure and the mapping of file names to file control blocks,
FCBs, which contain all of the meta data as well as block number information for
finding the data on the disk.
 The layered approach to file systems means that much of the code can be used uniformly for a wide
variety of different file systems, and only certain layers need to be filesystem specific.
 When a layered structure is used for file-system implementation, duplication of code is minimized. The
I/O control and sometimes the basic file-system code can be used by multiple file systems.
 Common file systems in use include the UNIX file system, UFS, the Berkeley Fast File System, FFS,
Windows systems FAT, FAT32, NTFS, CD-ROM systems ISO 9660, and for Linux the extended file
systems ext2 and ext3 .

File-System Implementation

On disk, the file system may contain information about how to boot an operating system stored
there, the total number of blocks, the number and location of free blocks, the directory structure,
and individual files.

File systems store several important data structures on the disk:

 A boot-control block, ( per volume) can contain information needed by the system to
boot an operating system from that volume. If the disk does not contain an operating
system, this block can be empty. It is typically the first block of a volume In UFS, it is
called the boot block; in NTFS, it is the partition boot sector.
 A volume control block, ( per volume ) contains volume (or partition) details, such as the
number of blocks in the partition, size of the blocks, freeblock count and free-block
pointers, and free FCB count and FCB pointers. In UFS, this is called a superblock; in
NTFS, it is stored in. the master file table.
o A directory structure ( per file system ), containing file names and pointers to
corresponding FCBs. UNIX uses inode numbers, and NTFS uses a master file
table.
o TheFile Control Block, FCB, ( per file ) containing details about ownership, size,
permissions, dates, etc. UNIX stores this information in inodes, and NTFS in the
master file table as a relational database structure.
 There are also several key data structures stored in memory:
o An in-memory mount table contains information about each mounted volume..
o An in-memory directory cache of recently accessed directory information.
o A system-wide open file table, containing a copy of the FCB for every currently
open file in the system, as well as some other related information.
o A per-process open file table, containing a pointer to the system open file table as
well as some other information. ( For example the current file position pointer
may be either here or in the system file table, depending on the implementation
and whether the file is being shared or not. )

 Figure 11.3 illustrates some of the interactions of file system components when files are
created and/or used:
o When a new file is created, a new FCB is allocated and filled out with important
information regarding the new file.
o When a file is accessed during a program, the open( ) system call reads in the
FCB information from disk, and stores it in the system-wide open file table. An
entry is added to the per-process open file table referencing the system-wide table,
and an index into the per-process table is returned by the open( ) system call.
UNIX refers to this index as a file descriptor, and Windows refers to it as a file
handle.
o If another process already has a file open when a new request comes in for the
same file, and it is sharable, then a counter in the system-wide table is
incremented and the per-process table is adjusted to point to the existing entry in
the system-wide table.
o When a file is closed, the per-process table entry is freed, and the counter in the
system-wide table is decremented. If that counter reaches zero, then the system
wide table is also freed. Any data currently stored in memory cache for this file is
written out to disk if necessary.
Partitions and Mounting

 Physical disks are commonly divided into smaller units called partitions. They canalso be
combined into larger units, but that is most commonly done for RAID installations and is left
for later chapters.
 Partitions can either be used as raw devices ( with no structure imposed upon them ), or
"cooked;' containing a file system. they can be formatted to hold a filesystem ( i.e. populated
with FCBs and initial directory structures as appropriate. ) Raw partitions are generally used
for swap space, and may also be used for certain programs such as databases that choose to
manage their own disk storage system. Partitions containing filesystems can generally only be
accessed using the file system structure by ordinary users, but can often be accessed as a raw
device also by root.
 Boot information can be stored in a separate partition. Again, it has its own format, because at
boot time the system does not have file-system device drivers loaded and therefore cannot
interpret the file-system format.
 The boot block is accessed as part of a raw partition, by the boot program prior to any
operating system being loaded. Modern boot programs understand multiple OSes and
filesystem formats, and can give the user a choice of which of several available systems to
boot.
 The root partition contains the OS kernel and at least the key portions of the OS needed to
complete the boot process. At boot time the root partition is mounted, and control is
transferred from the boot program to the kernel found there. ( Older systems required that the
root partition lie completely within the first 1024 cylinders of the disk, because that was as
far as the boot program could reach.
Once the kernel had control, then it could access partitions beyond the 1024 cylinder boundary. )
 Continuing with the boot process, additional filesystems get mounted, adding their information into
the appropriate mount table structure. As a part of the mounting process the file systems may be
checked for errors or inconsistencies, either because they are flagged as not having been closed
properly the last time they were used, or just for general principals. Filesystems may be mounted either
automatically or manually. In UNIX a mount point is indicated by setting a flag in the in-memory copy
of the inode, so all future references to that inode get re- directed to the root directory of the mounted
filesystem.

Virtual File Systems

 Virtual File Systems, VFS, provide a common interface to multiple different filesystem types.
In addition, it provides for a unique identifier ( vnode ) for files across the entire space,
including across all filesystems of different types. (UNIX inodes are unique only across a
single filesystem, and certainly do not carry across networked file systems.)
 The VFS in Linux is based upon four key object types:
o The inode object, representing an individual file
o The file object, representing an open file.
o The superblock object, representing a filesystem.
o The dentry object, representing an individual directory entry.
 Linux VFS provides a set of common functionalities for each filesystem, using function pointers
accessed through a table. The same functionality is accessed through the same table position for all
filesystem types, though the actual functions pointed to by the pointers may be filesystem-specific.
Common operations provided include open( ), read( ), write( ), and mmap( ).
Figure 11.4. The first layer is the file-system interface, based on the open(), read(), write(), and
close() calls and on file descriptors. The second layer is called the virtual file system (VFS) layer; it
serves two important functions:
1. It separates file-system-generic operations from their implementation by defining a clean
VFS interface. Several implementations for the VFS interface may coexist on the same
machine, allowing transparent access to different types of file systems mounted locally.
2. The VFS provides a mechanism for uniquely representing a file throughout a network.
The VFS is based on a file-representation structure, called a vnode, that contains a
numerical designator for a network-wide unique file. (UNIX inodes are unique within
only a single file system.) This network-wide uniqueness is required for support of
network file systems.
The kernel maintains one vnode structure for each active node (file or directory).

Directory Implementation

The selection of directory-allocation and directory-management algorithms significantly affects

the efficiency, performance, and reliability of the file system. Directories need to be fast to
search, insert, and delete, with a minimum of wasted disk space.

a) Linear List

A linear list is the simplest and easiest directory structure to set up, but it does have
some drawbacks.
The disadvantage of a linear list of directory entries is that finding a file requires a
linear search.
To overcome this, a software cache is implemented to store the recently accessed
directory structure.
Deletions can be done by moving all entries, flagging an entry as deleted, or by
moving the last entry into the newly vacant position.
A sorted list allows a binary search and decreases the average search time. However,
the requirement that the list be kept sorted may complicate creating and deleting files,
A linked list makes insertions and deletions into a sorted list easier, with overhead for
the links.
An advantage of the sorted list is that a sorted directory listing can be produced
without a separate sort step.

b) Hash Table

With this method, a linear list stores the directory entries, but a hash data structure is also
used.
The hash table takes a value computed from the file name and returns a pointer to the file
name in the linear list. Therefore, it can greatly decrease the directory search time.
Here collisions may occur. Collision is the situation where two file names hash to the
same location.
Alternatively, a chained-overflow hash table can be used. Each hash entry can be a linked
list instead of an individual value, and we can resolve collisions by adding the new entry
to the linked list.
The major disadvantage with a hash table are its generally fixed size and the dependence
of the hash function on that size. For example, assume that we make a linear-probing
hash table that holds 64 entries. The hash function converts file names into integers from
0 to 63, for instance, by using the remainder of a division by 64. If we later try to create a
65th file, we must enlarge the directory hash table—say, to 128 entries. As a result, we
need a new hash function that must map file names to the range 0 to 127, and we must
reorganize the existing directory entries to reflect their new hash-function values.

Allocation Methods

The main problem is how to allocate space to these files so that disk space is utilized effectively
and files can be accessed quickly. Three major methods of allocating disk space are in wide use:
contiguous, linked, and indexed.

a) Contiguous Allocation

 Contiguous Allocation requires that all blocks of a file be kept together

contiguously.
 Performance is very fast, because reading successive blocks of the same file
generally requires no movement of the disk heads, or at most one small step to the
next adjacent cylinder.
 Storage allocation is done by using one of the algorithms(first fit, best fit, worst
fit).
 The allocation of blocks contiguous leads to external fragmentation.
 Problems can arise when files grow, or if the exact size of a file is unknown at
creation time:
o Over-estimation of the file's final size increases external fragmentation
and wastes disk space.
o Under-estimation may require that a file be moved or a process aborted if
the file grows beyond its originally allocated space.
o If a file grows slowly over a long time period and the total final space
must be allocated initially, then a lot of space becomes unusable before the
file fills the space.
 A variation is to allocate file space in large contiguous chunks,
called extents. When a file outgrows its original extent, then an additional one
block is allocated. A pointer points from last block of contiguous memory
allocation to the extended chunk.
b) Linked Allocation

 Disk files can be stored as linked lists, with the expense of the storage space
consumed by each link. ( E.g. a block may be 508 bytes instead of 512. )
 Linked allocation involves no external fragmentation, does not require pre-known
file sizes, and allows files to grow dynamically at any time.
 Unfortunately linked allocation is only efficient for sequential access files, as
random access requires starting at the beginning of the list for each new location
access.
 Allocating clusters of blocks reduces the space wasted by pointers, at the cost of
internal fragmentation.
 Another big problem with linked allocation is reliability if a pointer is lost or
damaged. Doubly linked lists provide some protection, at the cost of additional
overhead and wasted space.
The File Allocation Table, FAT, used by DOS is a variation of linked allocation, where all the links
are stored in a separate table at the beginning of the disk. The benefit of this approach is that the FAT
table can be cached in memory, greatly improving random access speeds.
c) Indexed Allocation

 Indexed Allocation combines all of the indexes(block numbers) for accessingeach file into
a common block ( for that file ).
 Each file will have a common block called the index block.
 Some disk space is wasted ( relative to linked lists or FAT tables ) because an entire index
block must be allocated for each file, regardless of how many data blocks the file contains.
This leads to questions of how big the index block should be, and how it should be
implemented. There are several approaches:
o Linked Scheme - An index block is one disk block, which can be read and
written in a single disk operation. The first index block contains some header
information, the first N block addresses, and if necessary a pointer to additional
linked index blocks.
o Multi-Level Index - The first index block contains a set of pointers to secondary
index blocks, which in turn contain pointers to the actual data blocks.
o Combined Scheme - This is the scheme used in UNIX inodes, in which the first
12 entries data block pointers are stored directly in the inode, and then singly,
doubly, and triply indirect pointers provide access to more data blocks as
needed. The advantage of this scheme is that for small files( files stored in less
than 12 blocks ), the data blocks are readily accessible ( up to 48K with 4K block
sizes ); files up to about 4144K ( using 4K blocks ) are accessible with only a
single indirect block ( which can be cached ), and huge files are still accessible
using a relatively small number of disk accesses ( larger in theory than can be
addressed by a 32-bit address, which is why some systems have moved to 64-bit
file pointers. )
Free-Space Management:

The space created after deleting the files can be reused. Another important aspect of disk
management is keeping track of free space in memory. The list which keeps track of free space
in memory is called the free-space list. To create a file, search the free-space list for the required
amount of space and allocate that space to the new file. This space is then removed from the
free-space list. When a file is deleted, its disk space is added to the free-space list. The free-space
list, is implemented in different ways as explained below.

a) Bit Vector

 Fast algorithms exist for quickly finding contiguous blocks of a given size
 One simple approach is to use a bit vector, in which each bit represents a disk
block, set to 1 if free or 0 if allocated.

For example, consider a disk where blocks 2,3,4,5,8,9, 10,11, 12, 13, 17and 18 are free, and
the rest of the blocks are allocated. The free-space bit map would be
0011110011111100011

 Easy to implement and also very efficient in finding the first free block or ‘n’
consecutive free blocks on the disk.
 The down side is that a 40GB disk requires over 5MB just to store the bitmap.

b) Linked List
 A linked list can also be used to keep track of all free blocks.
 Traversing the list and/or finding a contiguous block of a given size are not easy,
but fortunately are not frequently needed operations. Generally the system just
adds and removes single blocks from the beginning of the list.
 The FAT table keeps track of the free list as just one more linked list on the table.

c) Grouping

 A variation on linked list free lists. It stores the addresses of n free blocks in the
first free block. The first n-1 blocks are actually free. The last block contains the
addresses of another n free blocks, and so on.
 The address of a large number of free blocks can be found quickly.

d) Counting

 When there are multiple contiguous blocks of free space then the system can keep
track of the starting address of the group and the number of contiguous free
blocks.
 Rather than keeping al list of n free disk addresses, we can keep the address of
first free block and the number of free contiguous blocks that follow the first
block.
 Thus the overall space is shortened. It is similar to the extent method of allocating
blocks.
e) Space Maps ( New )

 Sun's ZFS file system was designed for huge numbers and sizes of files,
directories, and even file systems.
 The resulting data structures could be inefficient if not implemented carefully.
For example, freeing up a 1 GB file on a 1 TB file system could involve updating
thousands of blocks of free list bit maps if the file was spread across the disk.
 ZFS uses a combination of techniques, starting with dividing the disk up into (
hundreds of) metaslabs of a manageable size, each having their own space map.
 Free blocks are managed using the counting technique, but rather than write the
information to a table, it is recorded in a log-structured transaction record.
Adjacent free blocks are also coalesced into a larger single free block.
 An in-memory space map is constructed using a balanced tree data structure,
constructed from the log data.
 The combination of the in-memory tree and the on-disk log provide for very fast
and efficient management of these very large files and free blocks.

Rom Bist
No ratings yet
Rom Bist
51 pages
OS Unit - 5 Notes
100% (1)
OS Unit - 5 Notes
34 pages
How To Add Custom Field To Condition Table
100% (1)
How To Add Custom Field To Condition Table
4 pages
Unit - I Database Mangement Systems
No ratings yet
Unit - I Database Mangement Systems
12 pages
Project Work Class X - Web
100% (1)
Project Work Class X - Web
52 pages
Complete Unit of Java
No ratings yet
Complete Unit of Java
66 pages
New4TH Operating System
No ratings yet
New4TH Operating System
99 pages
Module 5
No ratings yet
Module 5
68 pages
File Management-1
No ratings yet
File Management-1
84 pages
Operating System
No ratings yet
Operating System
42 pages
OS CHAPTER-11 - File Management
No ratings yet
OS CHAPTER-11 - File Management
44 pages
9 File Systems
No ratings yet
9 File Systems
38 pages
Unit 6 - File Management Notes
No ratings yet
Unit 6 - File Management Notes
30 pages
Unit-V Os
No ratings yet
Unit-V Os
27 pages
UNIT 4 Os
No ratings yet
UNIT 4 Os
39 pages
OS Unit5
No ratings yet
OS Unit5
23 pages
Unit-4 Os Notes
No ratings yet
Unit-4 Os Notes
27 pages
Chapter 2 - File System Management
No ratings yet
Chapter 2 - File System Management
43 pages
Unit V
No ratings yet
Unit V
34 pages
Os-Unit Iv
No ratings yet
Os-Unit Iv
30 pages
Module 4 Linux
No ratings yet
Module 4 Linux
23 pages
File System
No ratings yet
File System
27 pages
Operating System Unit-5
No ratings yet
Operating System Unit-5
27 pages
7269IV - 5th Semester - Computer Science and Engineering
No ratings yet
7269IV - 5th Semester - Computer Science and Engineering
37 pages
Coos Unit V Part 1&2
No ratings yet
Coos Unit V Part 1&2
16 pages
File Management
No ratings yet
File Management
18 pages
Unit 5 Os
No ratings yet
Unit 5 Os
23 pages
OSY Chapter 6 SSP
No ratings yet
OSY Chapter 6 SSP
24 pages
Unit-Iv File Management
No ratings yet
Unit-Iv File Management
21 pages
Data File
No ratings yet
Data File
22 pages
File System Interface: Unit - 5
No ratings yet
File System Interface: Unit - 5
24 pages
OS Unit 3 Part 2
No ratings yet
OS Unit 3 Part 2
20 pages
Os Unit 4
No ratings yet
Os Unit 4
20 pages
File System New
No ratings yet
File System New
16 pages
Wa0024
No ratings yet
Wa0024
30 pages
Os Unit 5
No ratings yet
Os Unit 5
15 pages
File Concept
No ratings yet
File Concept
21 pages
Osy 6
No ratings yet
Osy 6
19 pages
Module-5 - File System
No ratings yet
Module-5 - File System
16 pages
File Concept
No ratings yet
File Concept
14 pages
File Management
No ratings yet
File Management
26 pages
OS Unit-5
No ratings yet
OS Unit-5
20 pages
OSY Notes Vol 2 (6th Chapter) - Ur Engineering Friend
No ratings yet
OSY Notes Vol 2 (6th Chapter) - Ur Engineering Friend
23 pages
Os Module 5 Notes File System
No ratings yet
Os Module 5 Notes File System
17 pages
Unit-7 File System Interface Management
No ratings yet
Unit-7 File System Interface Management
15 pages
DoQuangHuy HE191197
No ratings yet
DoQuangHuy HE191197
8 pages
L-2.3.1 File System Management
No ratings yet
L-2.3.1 File System Management
8 pages
OSY Chapter 6
No ratings yet
OSY Chapter 6
12 pages
15 - Daily Class Notes System Software and Operating System
No ratings yet
15 - Daily Class Notes System Software and Operating System
7 pages
File-System Interface
No ratings yet
File-System Interface
47 pages
Whatisafile?: Attributes of The File
No ratings yet
Whatisafile?: Attributes of The File
15 pages
File System-1
No ratings yet
File System-1
11 pages
File System
No ratings yet
File System
8 pages
File System Management
No ratings yet
File System Management
9 pages
File System Interface Access Methods Directory Structure
No ratings yet
File System Interface Access Methods Directory Structure
27 pages
File 1. File Concept
No ratings yet
File 1. File Concept
6 pages
Mod 5 QB Soln
No ratings yet
Mod 5 QB Soln
5 pages
Os Lesson 3 File Management
No ratings yet
Os Lesson 3 File Management
9 pages
File System Print
No ratings yet
File System Print
9 pages
Chapter 5: File Systems
No ratings yet
Chapter 5: File Systems
15 pages
File Management
No ratings yet
File Management
4 pages
How To Use PuTTy and Commands
No ratings yet
How To Use PuTTy and Commands
11 pages
Forward Error Correction (FEC) : AMT 75 Series
No ratings yet
Forward Error Correction (FEC) : AMT 75 Series
4 pages
74 Series ICs
No ratings yet
74 Series ICs
13 pages
DW Questions
0% (1)
DW Questions
35 pages
DS. Unit I
No ratings yet
DS. Unit I
29 pages
Exam 4 Training Grile
No ratings yet
Exam 4 Training Grile
15 pages
Slides For Chapter 13: Name Services: Distributed Systems: Concepts and Design
No ratings yet
Slides For Chapter 13: Name Services: Distributed Systems: Concepts and Design
13 pages
AMD 64 BIOS and Kenrel Dev's Guide
No ratings yet
AMD 64 BIOS and Kenrel Dev's Guide
434 pages
BMS Ifbn640tle PDF
No ratings yet
BMS Ifbn640tle PDF
104 pages
InteliConfig Install Suite
No ratings yet
InteliConfig Install Suite
2 pages
Creating Custom Screen in Xd01
100% (1)
Creating Custom Screen in Xd01
9 pages
6234A MCT Checklist
No ratings yet
6234A MCT Checklist
7 pages
SQL - FlexBen System Requirements
No ratings yet
SQL - FlexBen System Requirements
24 pages
DXF Codes and AutoLISP Da
No ratings yet
DXF Codes and AutoLISP Da
4 pages
Normalization Notes
No ratings yet
Normalization Notes
20 pages
Index: S.No. Name of Practical Date
No ratings yet
Index: S.No. Name of Practical Date
16 pages
Online Random File Generator
No ratings yet
Online Random File Generator
2 pages
Record and Replay
No ratings yet
Record and Replay
38 pages
Renesas V850E2F Conexionados
No ratings yet
Renesas V850E2F Conexionados
22 pages
Git Quick Reference
100% (1)
Git Quick Reference
1 page
Lab 09 QPSK
No ratings yet
Lab 09 QPSK
4 pages
Laporan Praktikum Algoritma Dan Pemrograman: Pertemuan Ke - 5
No ratings yet
Laporan Praktikum Algoritma Dan Pemrograman: Pertemuan Ke - 5
13 pages
Dsa Midterm
No ratings yet
Dsa Midterm
5 pages
Job Description - Jasper Developer
No ratings yet
Job Description - Jasper Developer
1 page
Memory ERR CFX Solver
No ratings yet
Memory ERR CFX Solver
4 pages
Objectives: Classic Models
No ratings yet
Objectives: Classic Models
3 pages
C++ File Handling Step by Step: A Practical Guide with Examples
From Everand
C++ File Handling Step by Step: A Practical Guide with Examples
William E. Clark
No ratings yet
Python File Handling Made Easy: A Practical Guide with Examples
From Everand
Python File Handling Made Easy: A Practical Guide with Examples
William E. Clark
No ratings yet
Best Free Open Source Data Recovery Apps for Mac OS English Edition
From Everand
Best Free Open Source Data Recovery Apps for Mac OS English Edition
Cyber Jannah Sakura
No ratings yet

File Management Module-5

Uploaded by

File Management Module-5

Uploaded by

Implementation of File System

Directory and Disk Structure

File System Mounting

e) Internal File Structure

b) Other Access Methods(Indexed method)

The limitations of this structure is that -

 On a multi-user system, more information needs to be stored for each file:

Remote File Systems

a) The Client-Server Model

 When one computer system remotely mounts a filesystem that is physically

b) Distributed Information Systems

Examples of consistency semantics -

File systems store several important data structures on the disk:

Virtual File Systems

The selection of directory-allocation and directory-management algorithms significantly affects

 Contiguous Allocation requires that all blocks of a file be kept together

You might also like