Os Unit - 4
UNIT – IV
STORAGE MANAGEMENT: File system-Concept of a file, access methods, directory structure,
file system mounting, file sharing, protection. (T1: Ch-10) SECONDARY-STORAGE
STRUCTURE: Overview of mass storage structure, disk structure, disk attachment, disk
scheduling algorithms, swap space management, stable storage implementation, and tertiary
storage structure (T1: Ch-12).
File Concept
The operating system abstracts from the physical properties of its storage devices to define a
logical storage unit, the file. Files are mapped by the operating system onto physical devices.
These storage devices are usually nonvolatile, so the contents are persistent between system
reboots.
A file is a named collection of related information that is recorded on secondary
storage. From a user’s perspective, a file is the smallest allotment of logical secondary storage.
Commonly, files represent programs (both source and object forms) and data. Data files may
be numeric, alphabetic, alphanumeric, or binary.
The information in a file is defined by its creator. Many different types of information
may be stored in a file: source or executable programs, numeric or text data, photos, music,
video, and so on. A file has a certain defined structure, which depends on its type. A text file is
a sequence of characters organized into lines (and possibly pages). A source file is a
sequence of functions, each of which is further organized as declarations followed by
executable statements. An executable file is a series of code sections that the loader can
bring into memory and execute.
File Attributes
A file’s attributes vary from one operating system to another but typically consist of these:
• Name. The symbolic file name is the only information kept in human readable form.
• Identifier. This unique tag, usually a number, identifies the file within the file system; it is
the non-human-readable name for the file.
• Type. This information is needed for systems that support different types of files.
• Location. This information is a pointer to a device and to the location of the file on that
device.
• Size. The current size of the file (in bytes, words, or blocks) and possibly the maximum
allowed size are included in this attribute.
• Protection. Access-control information determines who can do reading, writing, executing,
and so on.
• Time, date, and user identification. This information may be kept for creation, last
modification, and last use. These data can be useful for protection, security, and usage
monitoring.
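As a rough illustration of the attributes listed above, the following Python sketch reads them back from the operating system with `os.stat`. The scratch file, the `describe` helper, and the dictionary keys are assumptions made for this example; the exact attributes available differ between operating systems (for instance, the identifier is an inode number on Unix-like systems).

```python
import os
import stat
import tempfile
import time


def describe(path):
    """Return a dict of some attributes the OS keeps for `path`."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),               # symbolic, human-readable name
        "identifier": info.st_ino,                    # inode number on Unix-like systems
        "type": "directory" if stat.S_ISDIR(info.st_mode) else "regular file",
        "size_bytes": info.st_size,                   # current size in bytes
        "protection": stat.filemode(info.st_mode),    # access-control bits as text
        "owner_uid": info.st_uid,                     # user identification
        "last_modified": time.ctime(info.st_mtime),   # time and date
    }


# Create a small scratch file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as scratch:
    scratch.write("hello")
    demo_path = scratch.name

attrs = describe(demo_path)
os.remove(demo_path)
print(attrs)
```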
File Operations
Creating a file.
Writing a file: The system must keep a write pointer to the location in the file where the
next write is to take place. The write pointer must be updated whenever a write occurs.
Reading a file: The system needs to keep a read pointer to the location in the file where the
next read is to take place. Because a process is usually either reading from or writing to a
file, the current operation location can be kept as a per-process current-file-position pointer.
Repositioning within a file: The current-file-position pointer is moved to a given value. This
operation is also known as a file seek.
Deleting a file.
Truncating a file.
Most of the file operations mentioned involve searching the directory for the entry associated
with the named file. To avoid this constant searching, many systems require that an open()
system call be made before a file is first used. The operating system keeps a table, called the
open-file table, containing information about all open files.
Typically, the open-file table also has an open count associated with each file to
indicate how many processes have the file open. Each close() decreases this open count, and
when the open count reaches zero, the file is no longer in use, and the file’s entry is removed
from the open-file table.
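The open-count bookkeeping described above can be sketched as a toy model in Python. The class name `OpenFileTable` and the file name `notes.txt` are hypothetical; a real kernel stores much more per entry (file position, access mode, disk location) and usually keeps both per-process and system-wide tables.

```python
class OpenFileTable:
    """Toy system-wide open-file table with a per-file open count."""

    def __init__(self):
        self.entries = {}  # file name -> number of processes that have it open

    def open(self, name):
        # A real system searches the directory once here; later operations
        # use the table entry instead of searching again.
        self.entries[name] = self.entries.get(name, 0) + 1

    def close(self, name):
        self.entries[name] -= 1
        if self.entries[name] == 0:
            del self.entries[name]  # last close: the entry is removed


table = OpenFileTable()
table.open("notes.txt")    # process A opens the file
table.open("notes.txt")    # process B opens the same file
table.close("notes.txt")   # A closes; B still has it open
count_after_first_close = table.entries["notes.txt"]
table.close("notes.txt")   # B closes; the entry disappears
```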
Some operating systems provide facilities for locking an open file (or sections of a file). File
locks allow one process to lock a file and prevent other processes from gaining access to it. A
shared lock is akin to a reader lock in that several processes can acquire the lock
concurrently. An exclusive lock behaves like a writer lock; only one process at a time can
acquire such a lock.
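The compatibility rule between shared and exclusive locks can be written down as a small, non-blocking model. This is only a sketch of the rule itself, under the assumption that a failed request returns `False` instead of waiting; the class `FileLock` and the process IDs are invented for the example and do not correspond to any particular operating system's locking API.

```python
class FileLock:
    """Toy lock for one file: many shared holders or one exclusive holder."""

    def __init__(self):
        self.shared_holders = set()
        self.exclusive_holder = None

    def acquire_shared(self, pid):
        if self.exclusive_holder is not None:
            return False                 # a writer holds the file
        self.shared_holders.add(pid)
        return True

    def acquire_exclusive(self, pid):
        if self.exclusive_holder is not None or self.shared_holders:
            return False                 # someone else is reading or writing
        self.exclusive_holder = pid
        return True

    def release(self, pid):
        self.shared_holders.discard(pid)
        if self.exclusive_holder == pid:
            self.exclusive_holder = None


lock = FileLock()
two_readers = lock.acquire_shared(1) and lock.acquire_shared(2)
writer_blocked = not lock.acquire_exclusive(3)   # readers still hold the lock
lock.release(1)
lock.release(2)
writer_admitted = lock.acquire_exclusive(3)
reader_blocked = not lock.acquire_shared(1)      # writer holds the lock
```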
Access Methods
Files store information. When it is used, this information must be accessed and read into
computer memory. The information in the file can be accessed in several ways. Some systems
provide only one access method for files, while others support many access methods, and
choosing the right one for a particular application is a major design problem.
There are three ways to access a file in a computer system: sequential access, direct access,
and the indexed sequential method.
Sequential Access
The simplest access method is sequential access. Information in the file is processed in
order, one record after the other. Reads and writes make up the bulk of the operations on a
file. A read operation, read_next(), reads the next portion of the file and automatically
advances a file pointer, which tracks the I/O location. Similarly, the write operation,
write_next(), appends to the end of the file and advances to the end of the newly written
material (the new end of file).
Key points:
1. Data is accessed one record after another, in order.
2. A read returns the next record and moves the file pointer ahead by one record.
3. A write allocates space for the new record at the end of the file and moves the
pointer to the new end of the file.
4. Such a method is reasonable for tape.
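A minimal sketch of sequential access, assuming fixed-size records held in an in-memory byte stream. The class `SequentialFile` and its `reset` method are illustrative names, not part of any real file API.

```python
import io


class SequentialFile:
    """Sequential access over fixed-size records in an in-memory stream."""

    def __init__(self, record_size):
        self.record_size = record_size
        self.data = io.BytesIO()
        self.position = 0  # the file pointer, in bytes

    def write_next(self, record):
        # Append at the end of the file and leave the pointer at the new end.
        if len(record) != self.record_size:
            raise ValueError("record has the wrong size")
        self.data.seek(0, io.SEEK_END)
        self.data.write(record)
        self.position = self.data.tell()

    def reset(self):
        self.position = 0  # rewind to the beginning of the file

    def read_next(self):
        # Read one record at the pointer, then advance the pointer past it.
        self.data.seek(self.position)
        record = self.data.read(self.record_size)
        self.position += len(record)
        return record


seq = SequentialFile(record_size=4)
for rec in (b"AAAA", b"BBBB", b"CCCC"):
    seq.write_next(rec)
seq.reset()
first, second = seq.read_next(), seq.read_next()
```

Records come back strictly in the order they were written; there is no way to reach the third record without passing the first two.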
Direct Access
Another method is direct access (or relative access). Here, a file is made up of fixed-length
logical records that allow programs to read and write records rapidly in no particular order.
The direct-access method is based on a disk model of a file, since disks allow random access
to any file block.
For direct access, the file is viewed as a numbered sequence of blocks or records. Thus,
we may read block 14, then read block 53, and then write block 7. There are no restrictions
on the order of reading or writing for a direct-access file. For the direct-access method, the
file operations must be modified to include the block number as a parameter. Thus, we have
read(n), where n is the block number, rather than read_next(), and write(n) rather than
write_next().
The block number provided by the user to the operating system is normally a relative
block number. A relative block number is an index relative to the beginning of the file. Thus,
the first relative block of the file is 0, the next is 1, and so on. When a file is used, its
information is read into computer memory, and there are several ways to access that
information.
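The read(n) and write(n) operations can be sketched in Python by turning the relative block number into a byte offset with `seek`. The block size of 8 bytes, the 64-block file, and the helper names `read_block` and `write_block` are assumptions chosen to keep the example small.

```python
import io

BLOCK_SIZE = 8  # bytes per logical block (illustrative value)


def write_block(f, n, payload):
    """Write `payload` into relative block n (block 0 starts the file)."""
    if len(payload) != BLOCK_SIZE:
        raise ValueError("payload must be exactly one block")
    f.seek(n * BLOCK_SIZE)
    f.write(payload)


def read_block(f, n):
    """Read relative block n."""
    f.seek(n * BLOCK_SIZE)
    return f.read(BLOCK_SIZE)


disk_file = io.BytesIO(bytes(BLOCK_SIZE * 64))  # zero-filled file of 64 blocks

# Blocks can be touched in any order, as in the text: 14, then 53, then 7.
write_block(disk_file, 14, b"block-14")
write_block(disk_file, 53, b"block-53")
write_block(disk_file, 7, b"block-07")
print(read_block(disk_file, 53))
```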
Partitioning is useful for limiting the sizes of individual file systems, putting multiple
file-system types on the same device, or leaving part of the device available for other uses,
such as swap space or unformatted (raw) disk space. A file system can be created on each of
these parts of the disk. Any entity containing a file system is generally known as a volume.
The volume may be a subset of a device, a whole device, or multiple devices linked together
into a RAID set. Each volume can be thought of as a virtual disk. Volumes can also store
multiple operating systems, allowing a system to boot and run more than one operating
system.
Each volume that contains a file system must also contain information about the files
in the system. This information is kept in entries in a device directory or volume table of
contents. The device directory (more commonly known simply as the directory) records
information—such as name, location, size, and type—for all files on that volume. Figure 11.7
shows a typical file-system organization.
Directory Overview
The directory can be viewed as a symbol table that translates file names into their
directory entries. If we take such a view, we see that the directory itself can be organized in
many ways. The organization must allow us to insert entries, to delete entries, to search for a
named entry, and to list all the entries in the directory.
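Viewed as a symbol table, a directory needs only the four operations named above. The sketch below uses a Python dictionary as the table; the class `Directory`, the method names, and the stored fields are assumptions for illustration, and real directories are stored on disk in formats that vary by file system.

```python
class Directory:
    """Toy directory: a symbol table from file names to directory entries."""

    def __init__(self):
        self.entries = {}

    def insert(self, name, location, size):
        if name in self.entries:
            raise FileExistsError(name)
        self.entries[name] = {"location": location, "size": size}

    def delete(self, name):
        del self.entries[name]

    def search(self, name):
        return self.entries.get(name)  # None when the name is absent

    def list_entries(self):
        return sorted(self.entries)


d = Directory()
d.insert("a.txt", location=120, size=300)
d.insert("b.txt", location=455, size=42)
found = d.search("b.txt")
d.delete("a.txt")
remaining = d.list_entries()
```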
What is a directory?
A directory can be defined as a listing of the related files on the disk. The directory may store
some or all of the file attributes. To get the benefit of different file systems on different
operating systems, a hard disk can be divided into a number of partitions of different sizes.
The partitions are also called volumes or minidisks.
Each partition must have at least one directory in which all the files of the partition
can be listed. A directory entry is maintained for each file in the directory and stores all the
information related to that file.
A directory can be viewed as a file that contains the metadata of a group of files.
Disadvantages of a single-level directory
1. We cannot have two files with the same name.
2. The directory may be very big, so searching for a file may take a long time.
3. Protection cannot be implemented for multiple users.
4. There is no way to group files of the same kind.
5. Choosing a unique name for every file is difficult and limits the number of files
in the system, because most operating systems limit the number of characters
used to construct a file name.
In the two-level directory structure, each user has his own user file directory (UFD).
The UFDs have similar structures, but each lists only the files of a single user. When a user job
starts or a user logs in, the system’s master file directory (MFD) is searched. The MFD is
indexed by user name or account number, and each entry points to the UFD for that user.
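The two-step lookup through the MFD and then the UFD can be pictured with nested dictionaries. The user names, file names, and block numbers below are made up for the example.

```python
# Master file directory: user name -> that user's file directory (UFD).
# Each UFD maps file names to a (hypothetical) starting block number.
mfd = {
    "alice": {"report.txt": 120, "a.out": 345},
    "bob": {"report.txt": 877},
}


def lookup(user, file_name):
    """Search the MFD for the user's UFD, then the UFD for the file."""
    ufd = mfd[user]
    return ufd[file_name]


alice_block = lookup("alice", "report.txt")
bob_block = lookup("bob", "report.txt")
```

Both users own a file called `report.txt` without conflict, because each name is only searched for inside its owner's UFD.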
Every file in the system has a path name. To name a file uniquely, a user must know the path
name of the file desired.
▪ There are two ways to specify a file path:
Absolute Path
▪ An absolute path reaches the specified file starting from the main (root) directory.
▪ The current directory is not involved; the path is always specified starting from the
root directory.
Relative Path
▪ The directory in which the user is currently working is called the current directory.
▪ A relative path reaches the specified file starting from the current directory.
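The difference between the two kinds of path can be sketched with Python's `posixpath` module, which manipulates Unix-style path strings without touching the disk. The `resolve` helper and the example directories are assumptions for illustration.

```python
import posixpath


def resolve(path, current_directory):
    """Turn an absolute or relative path into a normalized absolute path."""
    if posixpath.isabs(path):
        # Absolute path: the current directory plays no role.
        return posixpath.normpath(path)
    # Relative path: interpreted starting from the current directory.
    return posixpath.normpath(posixpath.join(current_directory, path))


absolute = resolve("/home/alice/notes.txt", "/tmp")
relative = resolve("notes.txt", "/home/alice")
sibling = resolve("../bob/todo.txt", "/home/alice")
```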
Each user has their own directory and cannot enter another user's directory.
A user may read the root directory's data but cannot write to or modify it.
Only the system administrator has complete access to the root directory.
Searching is more efficient in this directory structure. The concept of current working
directory is used. A file can be accessed by two types of path, either relative or absolute. In
tree structured directory systems, the user is given the privilege to create the files as well as
directories.
These kinds of directory graphs can be made using links or aliases, so a file can have
multiple paths. A link can be either symbolic (soft, logical) or hard (physical).
If a file is deleted in an acyclic-graph directory system, then
1. In the case of a soft link, the file is simply deleted and the link is left as a dangling pointer.
2. In the case of a hard link, the actual file is deleted only when all references to it have been
deleted.
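The two deletion behaviours can be modelled with a reference count per file. The sketch below is a toy in-memory model, not a real file-system API: the class `TinyFS`, the paths, and the "inode" numbering are all invented for the example.

```python
class TinyFS:
    """Toy model of hard links (reference counted) and soft links (by name)."""

    def __init__(self):
        self.inodes = {}   # inode number -> {"data": ..., "links": count}
        self.names = {}    # path -> ("hard", inode number) or ("soft", target path)
        self.next_inode = 1

    def create(self, path, data):
        ino = self.next_inode
        self.next_inode += 1
        self.inodes[ino] = {"data": data, "links": 1}
        self.names[path] = ("hard", ino)

    def hard_link(self, existing, new):
        kind, ino = self.names[existing]
        if kind != "hard":
            raise ValueError("this sketch only hard-links to regular entries")
        self.inodes[ino]["links"] += 1
        self.names[new] = ("hard", ino)

    def soft_link(self, target, new):
        self.names[new] = ("soft", target)  # stores only the target's path

    def delete(self, path):
        kind, ref = self.names.pop(path)
        if kind == "hard":
            self.inodes[ref]["links"] -= 1
            if self.inodes[ref]["links"] == 0:
                del self.inodes[ref]        # last reference gone: free the file

    def read(self, path):
        kind, ref = self.names[path]
        if kind == "soft":
            if ref not in self.names:
                raise FileNotFoundError(f"dangling link: {path} -> {ref}")
            return self.read(ref)
        return self.inodes[ref]["data"]


fs = TinyFS()
fs.create("/a/report", "quarterly numbers")
fs.hard_link("/a/report", "/b/report")
fs.soft_link("/a/report", "/c/shortcut")
fs.delete("/a/report")

via_hard_link = fs.read("/b/report")   # data survives through the hard link
try:
    fs.read("/c/shortcut")
    soft_link_dangles = False
except FileNotFoundError:
    soft_link_dangles = True
```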
File Systems
The file system is the part of the operating system that is responsible for file management. It
provides a mechanism to store data and to access file contents, including data and
programs. Some operating systems, for example Ubuntu, treat everything as a file.
The file system takes care of the following issues:
o File structure
We have seen various data structures in which a file can be stored. The task of the
file system is to maintain an optimal file structure.
o Recovering free space
Whenever a file is deleted from the hard disk, free space is created on the
disk. There can be many such spaces, which need to be recovered so that they can be
reallocated to other files.
o Disk space assignment to files
A major concern is deciding where to store each file on the hard disk.
o Tracking data location
A file may or may not be stored within a single block. It can be stored in
noncontiguous blocks on the disk, so we need to keep track of all the blocks in which
parts of the file reside.
File-System Mounting
• The basic idea behind mounting file systems is to combine multiple file systems into
one large tree structure.
• The mount command is given a file system to mount and a mount point ( directory )
on which to attach it.
• Once a file system is mounted onto a mount point, any further references to that
directory actually refer to the root of the mounted file system.
• Any files ( or sub-directories ) that had been stored in the mount point directory prior
to mounting the new file system are now hidden by the mounted file system, and are
no longer available. For this reason some systems only allow mounting onto empty
directories.
• File systems can only be mounted by root, unless root has previously configured
certain file systems to be mountable onto certain pre-determined mount points. ( E.g.
root may allow users to mount floppy file systems to /mnt or something like it. )
Anyone can run the mount command to see what file systems are currently mounted.
• File systems may be mounted read-only, or have other restrictions imposed.
Figure 11.14 - File system. (a) Existing system. (b) Unmounted volume.
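How a mount point redirects path lookups, and why files already in the mount-point directory become hidden, can be sketched with a small mount table. This is a simplified model under stated assumptions: the names `MountTable`, `rootfs`, `usbfs`, and the path `/mnt/usb` are hypothetical, and real kernels resolve paths component by component instead of by string prefix.

```python
import posixpath


class MountTable:
    """Toy mount table: maps mount points to file-system names."""

    def __init__(self):
        self.mounts = {"/": "rootfs"}

    def mount(self, fs_name, mount_point):
        self.mounts[posixpath.normpath(mount_point)] = fs_name

    def resolve(self, path):
        """Return (file system, path inside it) for an absolute path."""
        path = posixpath.normpath(path)
        best = "/"
        for point in self.mounts:
            under_point = path == point or path.startswith(point.rstrip("/") + "/")
            if under_point and len(point) > len(best):
                best = point  # the longest matching mount point wins
        inside = "/" + path[len(best):].lstrip("/")
        return self.mounts[best], inside


table = MountTable()
before = table.resolve("/mnt/usb/old.txt")   # served by the root file system
table.mount("usbfs", "/mnt/usb")
after = table.resolve("/mnt/usb/old.txt")    # now refers to the mounted volume
```

After the mount, the same path names a file on the mounted volume, so whatever the root file system stored under that directory can no longer be reached by that path.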
File Sharing
Multiple Users
• On a multi-user system, more information needs to be stored for each file:
o The owner ( user ) who owns the file, and who can control its access.
o The group of other user IDs that may have some special access to the file.
o What access rights are afforded to the owner ( User ), the Group, and to the rest
of the world ( the universe, a.k.a. Others. )
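A sketch of how the owner, group, and others rights might be checked, assuming Unix-style permission bits. The function `may_access` and the numeric IDs are illustrative; the sketch ignores the superuser and access-control lists, which real systems also consult.

```python
import stat


def may_access(mode, file_uid, file_gid, uid, gids, want):
    """Classic owner/group/others check; `want` is 'r', 'w' or 'x'."""
    owner_bit, group_bit, other_bit = {
        "r": (stat.S_IRUSR, stat.S_IRGRP, stat.S_IROTH),
        "w": (stat.S_IWUSR, stat.S_IWGRP, stat.S_IWOTH),
        "x": (stat.S_IXUSR, stat.S_IXGRP, stat.S_IXOTH),
    }[want]
    if uid == file_uid:            # the owner's bits apply
        return bool(mode & owner_bit)
    if file_gid in gids:           # otherwise the group's bits
        return bool(mode & group_bit)
    return bool(mode & other_bit)  # otherwise the bits for everyone else


MODE = 0o640  # owner: read and write, group: read, others: nothing
owner_can_write = may_access(MODE, 1000, 50, uid=1000, gids={50}, want="w")
group_can_read = may_access(MODE, 1000, 50, uid=1001, gids={50}, want="r")
group_can_write = may_access(MODE, 1000, 50, uid=1001, gids={50}, want="w")
other_can_read = may_access(MODE, 1000, 50, uid=1002, gids={60}, want="r")
```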
The indexed allocation method solves the problems of both contiguous and
linked allocation. It does so by bringing all the pointers together into one location
called the index block. The index block occupies some space and can thus
be considered an overhead of the method. In indexed allocation, each file has its own
index block, which is an array of disk-sector addresses. The ith entry in the index block
points to the ith sector of the file. The directory contains the address of the index block of a
file. To read the ith sector of the file, the pointer in the ith index-block entry is read to find
the desired sector. Indexed allocation supports direct access without suffering from external
fragmentation: any free block anywhere on the disk may satisfy a request for more space.
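The lookup through the index block can be sketched with a dictionary standing in for the disk. The sector numbers inside the index block and the sector contents are made-up values for illustration; only the idea that a directory entry holds the address of the index block comes from the text.

```python
# A toy disk: sector number -> contents. Sector 19 holds the index block,
# i.e. the list of sector addresses that make up the file, in order.
disk = {
    19: [9, 16, 1, 10, 25],
    9: "part-0",
    16: "part-1",
    1: "part-2",
    10: "part-3",
    25: "part-4",
}

# The directory stores only the address of each file's index block.
directory = {"jeep": 19}


def read_sector(file_name, i):
    """Read the ith sector of a file by going through its index block."""
    index_block = disk[directory[file_name]]
    return disk[index_block[i]]


third_part = read_sector("jeep", 3)
```

Any sector of the file is reachable in two reads (index block, then data), regardless of where the sectors lie on the disk.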
Task: File management is a big problem in operating systems. How can it be resolved?
8.10.1 Bit-Vector
Frequently, the free-space list is implemented as a bit map or bit vector. Each block is represented
by one bit. If the block is free, the bit is 0; if the block is allocated, the bit is 1.
Example: Consider a disk where blocks 2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, and 27 are
free, and the rest of the blocks are allocated. The free-space bit map would be:
11000011000000111001111110001111…
The main advantage of this approach is that it is relatively simple and efficient to find n consecutive
free blocks on the disk. Unfortunately, bit vectors are inefficient unless the entire vector is kept
in memory for most accesses. Keeping it in main memory is possible for smaller disks such as on
microcomputers, but not for larger ones.
Bit vector for a disk of n blocks (bits numbered 0, 1, 2, ..., n-1):
bit[i] = 0 if block[i] is free
bit[i] = 1 if block[i] is occupied
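The example bit map can be rebuilt and searched with a few lines of Python. The sketch keeps the convention used in this section (0 means free, 1 means allocated) and represents the bit vector as a string for readability; the 32-block disk size and the function names are assumptions, and a real implementation would pack the bits into machine words.

```python
FREE_BLOCKS = [2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, 27]
TOTAL_BLOCKS = 32  # assumed disk size for the example


def build_bitmap(free_blocks, total):
    """Return the bit vector as a string: '0' = free, '1' = allocated."""
    free = set(free_blocks)
    return "".join("0" if i in free else "1" for i in range(total))


def first_free(bitmap):
    """Number of the first free block, or -1 if the disk is full."""
    return bitmap.find("0")


def find_run(bitmap, n):
    """Start of the first run of n consecutive free blocks, or -1."""
    return bitmap.find("0" * n)


bitmap = build_bitmap(FREE_BLOCKS, TOTAL_BLOCKS)
print(bitmap)
```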
Another approach is to link all the free disk blocks together, keeping a pointer to the first free
block. This block contains a pointer to the next free disk block, and so on. In the previous example,
a pointer could be kept to block 2, as the first free block. Block 2 would contain a pointer to block
3, which would point to block 4, which would point to block 5, which would point to block 8, and
so on. This scheme is not efficient; to traverse the list, each block must be read, which requires
substantial I/O time.
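The linked free list from the example can be sketched as follows. The `block_reads` counter is an illustrative stand-in for disk I/O: every pointer that is followed costs one block read, which is why traversing the whole list is slow. The class and attribute names are assumptions.

```python
FREE_BLOCKS = [2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, 27]


class LinkedFreeList:
    """Toy free list: each free block stores the number of the next one."""

    def __init__(self, free_blocks):
        blocks = list(free_blocks)
        # next_pointer[b] models the pointer stored inside free block b.
        self.next_pointer = dict(zip(blocks, blocks[1:] + [None]))
        self.head = blocks[0] if blocks else None
        self.block_reads = 0

    def allocate(self):
        """Take the first free block; one read fetches its successor."""
        if self.head is None:
            raise RuntimeError("no free blocks")
        block = self.head
        self.block_reads += 1
        self.head = self.next_pointer.pop(block)
        return block

    def traverse(self):
        """List every free block; costs one read per block visited."""
        found, block = [], self.head
        while block is not None:
            self.block_reads += 1
            found.append(block)
            block = self.next_pointer[block]
        return found


free_list = LinkedFreeList(FREE_BLOCKS)
first = free_list.allocate()      # cheap: only the head block is read
everything = free_list.traverse() # expensive: every remaining block is read
```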
[Figure: Linked Scheme. Directory entry for file jeep with first index block 19, followed by data blocks.]
8.10.3 Grouping
A modification of the free-list approach is to store the addresses of n free blocks in the first free
block. The first n-1 of these are actually free. The last one is the disk address of another block
containing addresses of another n free blocks. The importance of this implementation is that
addresses of a large number of free blocks can be found quickly.
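A sketch of grouping, reusing the free blocks from the bit-vector example with n = 4. It assumes that the block holding each group of addresses is itself a free block and that the final group is padded with `None`; the function names are hypothetical.

```python
FREE_BLOCKS = [2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, 27]


def build_groups(free_blocks, n):
    """Return (head, contents); contents[b] is the address list kept in block b.

    Each list holds up to n-1 free block addresses followed by the address
    of the next block that holds a list (None at the end of the chain).
    """
    blocks = list(free_blocks)
    head = blocks[0] if blocks else None
    rest = blocks[1:]
    contents = {}
    current = head
    while current is not None:
        group, rest = rest[:n - 1], rest[n - 1:]
        next_holder = rest[0] if rest else None
        rest = rest[1:]
        contents[current] = group + [next_holder]
        current = next_holder
    return head, contents


def list_free(head, contents):
    """Collect all free blocks, counting how many blocks had to be read."""
    found, reads, current = [], 0, head
    while current is not None:
        reads += 1
        found.append(current)
        *group, current = contents[current]
        found.extend(group)
    return found, reads


head, contents = build_groups(FREE_BLOCKS, 4)
found, reads = list_free(head, contents)
```

With these numbers, all 15 free blocks are discovered by reading only 4 blocks, compared with one read per block for the plain linked list.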
8.10.4 Counting
Another approach is to take advantage of the fact that, generally, several contiguous blocks may
be allocated or freed simultaneously, particularly when contiguous allocation is used. Thus,
rather than keeping a list of free disk addresses, the address of the first free block is kept and the
number n of free contiguous blocks that follow the first block. Each entry in the free-space list
then consists of a disk address and a count. Although each entry requires more space than would
a simple disk address, the overall list will be shorter, as long as the count is generally greater
than 1.
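Applied to the free blocks from the bit-vector example, counting collapses the 15 addresses into 4 entries. The function name `to_runs` and the tuple layout (first block, count) are assumptions for this sketch.

```python
FREE_BLOCKS = [2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, 27]


def to_runs(free_blocks):
    """Compress free block numbers into (first block, count) entries."""
    runs = []
    for block in sorted(free_blocks):
        if runs and runs[-1][0] + runs[-1][1] == block:
            start, count = runs[-1]
            runs[-1] = (start, count + 1)   # extends the current run
        else:
            runs.append((block, 1))         # starts a new run
    return runs


runs = to_runs(FREE_BLOCKS)
print(runs)
```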
[Figure: Data Segment. Blocks 0 to 6 hold an O/S header, an Oracle file header, a segment header with free lists 1 and 2 (labelled with instances X and Y), and data blocks.]
8.12 Summary
A file is a named collection of data stored on a device.
The file manager is an integral part of the operating system that is responsible for the
maintenance of secondary storage.
A file system is a set of abstract data types that are implemented for the storage, hierarchical
organization, manipulation, navigation, access, and retrieval of data.
A disk file system is a file system designed for the storage of files on a data storage device,
most commonly a disk drive, which might be directly or indirectly connected to the
computer.
A flash file system is a file system designed for storing files on flash memory devices.
A network file system is a file system that acts as a client for a remote file access protocol,
providing access to files on a server.