OS Unit 4
File Systems:
Computer users store programs and data in files so that they can be used conveniently and
preserved across computing sessions. A user has many expectations when working with files.
The resources used for storing and accessing files are I/O devices. As it must, the OS ensures
both efficient performance of file processing activities in processes and efficient use of I/O
devices.
Operating systems organize file management into two components called the file system and
the input-output control system (IOCS) to separate the file-level concerns from concerns
related to efficient storage and access of data. Accordingly, a file system provides facilities for
creating and manipulating files, for ensuring reliability of files when faults such as power
outages or I/O device malfunctions occur, and for specifying how files are to be shared among
users. The IOCS provides access to data stored on I/O devices and good performance of I/O
devices.
We use the term file processing to describe the general sequence of operations of opening a
file, reading data from the file or writing data into it, and closing the file. Figure 6.1 shows the
arrangement through which an OS implements file processing activities of processes. Each
directory contains entries describing some files. The directory entry of a file indicates the name
of its owner, its location on a disk, the way its data is organized, and which users may access it
in what manner.
The code of a process Pi is shown in the left part of Figure 6.1. When it opens a file for
processing, the file system locates the file through the directory
structure, which is an arrangement of many directories. In Figure 6.1, there are two files named
beta located in different directories. When process Pi opens beta, the manner in which it names
beta, the directory structure, and identities of the user who initiated process Pi will together
determine which of the two files will be accessed.
A file system provides several file types. Each file type provides its own abstract view of
data in a file; this is the logical view of the data. Figure 6.1 shows that file beta opened by process Pi
has a record-oriented logical view, while file phi has a byte stream–oriented logical view in
which distinct records do not exist.
Figure 6.1 File systems and the IOCS.
The IOCS organizes a file's data on an I/O device in accordance with its file type; this is the
physical view of the file's data. The mapping between the logical view of the file's data and its
physical view is performed by the IOCS. The IOCS also provides an arrangement that speeds
up a file processing activity—it holds some data from a file in memory areas organized as
buffers, a file cache, or a disk cache. When a process performs a read operation to get some
data from a file, the IOCS takes the data from a buffer or a cache if it is present there. This way,
the process does not have to wait until the data is read off the I/O device that holds the file.
Analogously, when a process performs a write operation on a file, the IOCS copies the data to
be written in a buffer or in a cache. The actual I/O operations to read data from an I/O device
into a buffer or a cache, or to write it from there onto an I/O device, are performed by the IOCS
in the background.
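The buffering and caching behavior described above can be sketched as a toy model. This is an illustration only, not a real IOCS: block numbers and data values are invented, and the "device" is simulated by a dictionary.

```python
class SimpleFileCache:
    """Toy model of the IOCS file cache: reads are served from memory
    when possible; writes go to the cache and reach the device later."""

    def __init__(self, device_blocks):
        self.device = dict(device_blocks)  # simulated disk: block_no -> data
        self.cache = {}                    # in-memory copies of blocks
        self.device_reads = 0              # actual I/O operations performed

    def read(self, block_no):
        if block_no in self.cache:         # cache hit: no wait for the device
            return self.cache[block_no]
        self.device_reads += 1             # cache miss: read off the device
        data = self.device[block_no]
        self.cache[block_no] = data        # retain the block for future reads
        return data

    def write(self, block_no, data):
        self.cache[block_no] = data        # actual device write happens later
```

Reading the same block twice causes only one device access, which is exactly why the process "does not have to wait" on a repeated read.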
6.1.1 File System and the IOCS
A file system views a file as a collection of data that is owned by a user, can be shared by a set
of authorized users, and has to be reliably stored over an extended period of time. A file system
gives users freedom in naming their files, as an aspect of ownership, so that a user can give a
desired name to a file without worrying whether it conflicts with names of other users' files;
and it provides privacy by protecting against interference by other users. The IOCS, on the
other hand, views a file as a repository of data that need to be accessed speedily and are stored
on an I/O device that needs to be used efficiently.
Table 6.1 summarizes the facilities provided by the file system and the IOCS. The file system
provides directory structures that enable users to organize their data into logical groups of
files, e.g., one group of files for each professional activity. The file system provides protection
against illegal file accesses and ensures correct operation when processes access and update a
file concurrently. It also ensures that data is reliably stored, i.e., data is not lost when system
crashes occur. Facilities of the IOCS are as described earlier.
The file system and the IOCS form a hierarchy. Each of them has policies and provides
mechanisms to implement the policies.
Table 6.1 Facilities Provided by the File System and the Input-Output Control System
File System
• Directory structures for convenient grouping of files
• Protection of files against illegal accesses
• File sharing semantics for concurrent accesses to a file
• Reliable storage of files
Data and Metadata A file system houses two kinds of data—data contained within files, and
data used to access files. We call the data within files file data, or simply data. The data used
to access files is called control data, or metadata. In the logical view shown in Figure 6.1, data
contained in the directory structure is metadata.
File Types A file system houses and organizes different types of files, e.g., data files, executable
programs, object modules, textual information, documents, spreadsheets, photos, and video
clips. Each of these file types has its own format for recording the data. These file types can be
grouped into two classes:
• Structured files
• Byte stream files
A structured file is a collection of records, where a record is a meaningful unit for processing
of data. A record is a collection of fields, and a field contains a single data item. Each record
in a file is assumed to contain a key field. The value in the key field of a record is unique in a
file; i.e., no two records contain an identical key.
Many file types mentioned earlier are structured files. File types used by standard system
software like compilers and linkers have a structure determined by the OS designer, while file
types of user files depend on the applications or programs that create them.
A byte stream file is "flat." There are no records and fields in it; it is looked upon as a sequence
of bytes by the processes that use it. The next example illustrates structured and byte stream
files.
Figure 6.2(a) shows a structured file named employee_info. Each record in the file contains
information about one employee. A record contains four fields: employee id, name,
designation, and age. The field containing the employee id is the key field. Figure 6.2(b) shows
a byte stream file report.
Figure 6.2 Logical views of (a) a structured file employee_info; (b) a byte stream file report.
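The two logical views can be sketched as follows. The field values below are invented; only the structure mirrors Figure 6.2, with the employee id as the key field that is unique within the file.

```python
# Structured file: a collection of records, each a collection of fields.
employee_info = [
    {"id": 1, "name": "Anita", "designation": "clerk", "age": 32},
    {"id": 2, "name": "Bhaskar", "designation": "manager", "age": 41},
    {"id": 4, "name": "Chitra", "designation": "analyst", "age": 28},
]

def find_record(records, key):
    """Return the record whose key field (the employee id) equals `key`."""
    for rec in records:
        if rec["id"] == key:
            return rec
    return None          # no record has this key value

# Byte stream file: no records or fields, just a sequence of bytes.
report = b"Sales grew steadily through the quarter..."
```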
File Attributes A file attribute is a characteristic of a file that is important either to its users or
to the file system, or both. Commonly used attributes of a file are: type, organization, size,
location on disk, access control information (which indicates the manner in which different
users can access the file), owner name, time of creation, and time of last use. The file system
stores the attributes of a file in its directory entry. During a file processing activity, the file
system uses the attributes of a file to locate it, and to ensure that each operation being performed
on it is consistent with its attributes. At the end of the file processing activity, the file system
stores changed values of the file's attributes, if any, in the file's directory entry.
Table 6.2 Operations on Files
6.3 FUNDAMENTAL FILE ORGANIZATIONS AND ACCESS METHODS
The term record access pattern describes the order in which records in a file are accessed by
a process. The two fundamental record access patterns are sequential access, in which records
are accessed in the order in which they fall in a file (or in the reverse of that order), and random
access, in which records may be accessed in any order. The file processing actions of a process
will execute efficiently only if the process‘s record access pattern can be
implemented efficiently in the file system. The characteristics of an I/O device make it
suitable for a specific record access pattern. For example, a tape drive can access only the
record that is placed immediately before or after the current position of its read/write head.
Hence it is suitable for sequential access to records. A disk drive can directly access any record
given its address. Hence it can efficiently implement both the sequential and random record
access patterns.
A file organization is a combination of two features—a method of arranging records in a file
and a procedure for accessing them. A file organization is designed to exploit the characteristics
of an I/O device for providing efficient record access for a specific record access pattern. A file
system supports several file organizations so that a process can employ the one that best suits
its file processing requirements and the I/O device in use. This section describes three
fundamental file organizations—sequential file organization, direct file organization and index
sequential file organization. Other file organizations used in practice are either variants of these
fundamental ones or are special-purpose organizations that exploit less commonly used I/O
devices. Accesses to files governed by a specific file organization are implemented by an IOCS
module called an access method. An access method is a policy module of the IOCS. While
compiling a program, the compiler infers the file organization governing a file from the file‘s
declaration statement (or from the rules for default, if the program does not contain a file
declaration statement), and identifies the correct access method to invoke for operations on the
file.
The direct file organization provides convenience and efficiency of file processing when
records are accessed in a random order. To access a record, a read/write command needs to
mention the value in its key field. We refer to such files as direct-access files. A direct-access
file is implemented as follows: When a process provides the key value of a record to be
accessed, the access method module for the direct file organization applies a transformation to
the key value that generates the address of the record in the storage medium. If the file is
organized on a disk, the transformation generates a (track_no, record_no) address. The disk
heads are now positioned on the track track_no before a read or write command is issued on
the record record_no.
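The key-to-address transformation can be sketched as below. The division-based formula and the assumption that keys run 1, 2, 3, ... are illustrative choices; `records_per_track` is an assumed parameter, not something the text fixes.

```python
def key_to_address(key, records_per_track):
    """Map a record's key to a (track_no, record_no) disk address.

    Assumes keys are consecutive integers starting at 1, so a record
    (possibly a dummy one) exists for every key value."""
    slot = key - 1                              # keys are 1-based
    track_no = slot // records_per_track        # which track holds the slot
    record_no = slot % records_per_track        # position within the track
    return (track_no, record_no)
```

With 4 records per track, key 1 maps to track 0, record 0, and key 11 maps to track 2, record 2; the disk heads are positioned on that track before the read or write command is issued.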
Direct file organization provides access efficiency when records are processed randomly.
However, it has three drawbacks compared to sequential file organization:
• Record address calculation consumes CPU time.
• Disks can store much more data along the outermost track than along the innermost track.
However, the direct file organization stores an equal amount of data along each track. Hence
some recording capacity is wasted.
• The address calculation formulas work correctly only if a record exists for every possible
value of the key, so dummy records have to exist for keys that are not in use. This requirement
leads to poor utilization of the I/O medium. Hence sequential processing of records in a direct-
access file is less efficient than processing of records in a sequential-access file.
Figure 6.3 shows the arrangement of employee records in sequential and direct file
organizations. Employees with the employee numbers 3, 5–9 and 11 have left the
organization. However, the direct-access file needs to contain a record for each of these
employees to satisfy the address calculation formulas. This fact leads to the need for dummy
records in the direct access file.
Figure 6.5 illustrates a file of employee information organized as an index sequential file.
Records are stored in ascending order by the key field. Two indexes are built to facilitate speedy
search. The track index indicates the smallest and largest key value located on each track (see
the fields named low and high in Figure 6.5). The higher-level index contains entries for groups
of tracks containing 3 tracks each. To locate the record with a key k, first the higher-level index
is searched to locate the group of tracks that may contain the desired record. The track index
for the tracks of the group is now searched to locate the track that may contain the desired
record, and the selected track is searched sequentially for the record with key k. The search
ends unsuccessfully if it fails to find the record on the track.
Figure 6.5 Track index and higher-level index in an index sequential file.
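The two-level search can be sketched as follows. The key values and track contents are invented; as in the text, the higher-level index has one entry per group of 3 tracks, and the final step is a sequential scan of one track.

```python
# Keys of the records stored on each track, in ascending order.
tracks = {0: [1, 3], 1: [5, 7], 2: [9, 11], 3: [13, 15], 4: [17, 19], 5: [21, 23]}

# Track index: one (low, high, track_no) entry per track.
track_index = [(1, 3, 0), (5, 7, 1), (9, 11, 2), (13, 15, 3), (17, 19, 4), (21, 23, 5)]

# Higher-level index: (largest key in group, track index entries of the group).
higher_index = [(11, track_index[:3]), (23, track_index[3:])]

def find_key(k):
    """Return the track holding the record with key k, or None if absent."""
    for largest, group in higher_index:          # search higher-level index
        if k <= largest:
            for low, high, t in group:           # search group's track index
                if low <= k <= high:
                    # sequential search of the selected track
                    return t if k in tracks[t] else None
            return None                          # no track covers key k
    return None
```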
An access method is a module of the IOCS that implements accesses to a class of files using a
specific file organization. The procedure to be used for accessing records in a file, whether by
a sequential search or by address calculation, is determined by the file organization. The access
method module uses this procedure to access records. It may also use some advanced
techniques in I/O programming to make file processing more efficient. Two such techniques
are buffering and blocking of records.
Buffering of Records The access method reads records of an input file ahead of the time when
they are needed by a process and holds them temporarily in memory areas called buffers until
they are actually used by the process. The purpose of buffering is to reduce or eliminate the
wait for an I/O operation to complete; the process faces a wait only when the required record
does not already exist in a buffer. The converse actions are performed for an output file. When
the process performs a write operation, the data to be written into the file is copied into a buffer
and the process is allowed to continue its operation. The data is written on the I/O device
sometime later and the buffer is released for reuse. The process faces a wait only if a buffer is
not available when it performs a write operation.
Blocking of Records The access method always reads or writes a large block of data, which
contains several file records, from or to the I/O medium. This feature reduces the total number
of I/O operations required for processing a file, thereby improving the file processing efficiency
of a process. Blocking also improves utilization of an I/O medium and throughput of a device.
6.4 DIRECTORIES
A directory contains information about a group of files. Each entry in a directory contains the
attributes of one file, such as its type, organization, size, location, and the manner in which it
may be accessed by various users in the system. Figure 6.6 shows the fields of a typical
directory entry. The open count and lock fields are used when several processes open a file
concurrently. The open count indicates the number of such processes. As long as this count is
nonzero, the file system keeps some of the metadata concerning the file in memory to speed up
accesses to the data in the file. The lock field is used when a process desires exclusive access
to a file. The flags field is used to differentiate between different kinds of directory entries. We
put the value "D" in this field to indicate that a file is a directory, "L" to indicate that it is a
link, and "M" to indicate that it is a mounted file system. Later sections in this chapter will
describe these uses. The misc info field contains information such as the file's owner, its time
of creation, and last modification.
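The fields of such a directory entry can be sketched as a record. The field names follow the description above; this is not an actual on-disk format, and the types are assumptions for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class DirEntry:
    """Toy directory entry with the fields of Figure 6.6."""
    name: str
    location_info: int                  # e.g., address of the file's first disk block
    protection_info: dict = field(default_factory=dict)
    open_count: int = 0                 # processes that currently have the file open
    lock: bool = False                  # set when a process has exclusive access
    flags: str = ""                     # "D" directory, "L" link, "M" mounted file system
    misc_info: str = ""                 # owner, time of creation, last modification

entry = DirEntry(name="beta", location_info=3)
entry.open_count += 1                   # a process opened the file; metadata stays in memory
```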
A file system houses files owned by several users. Therefore it needs to grant users two
important prerogatives:
• File naming freedom: A user's ability to give any desired name to a file, without being
constrained by filenames chosen by other users
• File sharing: A user‘s ability to access files created by other users, and ability to permit
other users to access his files.
The file system creates several directories and uses a directory structure to organize them for
providing file naming freedom and file sharing.
Figure 6.7 shows a simple directory structure containing two kinds of directories. A user
directory (UD) contains entries describing the files owned by one user. The master directory
contains information about the UDs of all registered users of the system; each entry in the
master directory is an ordered pair consisting of a user id and a pointer to a UD. In the file
system shown, users A and B have each created their own file named beta. These files have
entries in the users' respective UDs.
Use of separate UDs is what provides naming freedom. When a process created by user A
executes the statement open (beta, ...), the file system searches the master directory to locate
A‘s UD, and searches for beta in it. If the call open(beta, ...) had instead been executed by
some process created by B, the file system would have searched B‘s UD for beta. This
arrangement ensures that the correct file is accessed even if many files with identical names
exist in the system.
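The resolution of open (beta, ...) through the master directory can be sketched as follows. The directory contents are invented; the point is that the user id selects the UD, so identically named files in different UDs never clash.

```python
# Master directory: user id -> that user's UD (filename -> file contents).
master_directory = {
    "A": {"beta": "A's beta", "alpha": "A's alpha"},
    "B": {"beta": "B's beta"},
}

def open_file(user_id, filename):
    """Locate the user's UD via the master directory, then the file in it."""
    ud = master_directory[user_id]       # search master directory for the UD
    if filename not in ud:
        raise FileNotFoundError(filename)
    return ud[filename]                  # search for the file in the UD
```

A process created by user A opening beta gets A's file; the same call made by a process of user B gets B's file.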
Use of UDs has one drawback, however. It inhibits users from sharing their files with other
users. A special syntax may have to be provided to enable a user to refer to another user‘s file.
For example, a process created by user C may execute the statement open (A→beta, ...) to open
A‘s file beta. The file system can implement this simply by using A‘s UD, rather than C‘s UD,
to search and locate file beta. To implement file protection, the file system must determine
whether user C is permitted to open A‘s file beta. It checks the protection info field of beta‘s
directory entry for this purpose.
6.4.1 Directory Trees
The MULTICS file system of the 1960s contained features that allowed the user to create a
new directory, give it a name of his choice, and create files and other directories in it up to any
desired level. The resulting directory structure is a tree; it is called the directory tree. After
MULTICS, most file systems have provided directory trees.
At any time, a user is said to be "in" some specific directory, which is called his current
directory. When the user wishes to open a file, the file name is searched for in this
directory. Whenever the user logs in, the OS puts him in his home directory; the home directory
is then the user's current directory. A user can change his current directory at any time through
a "change directory" command.
A file's name may not be unique in the file system, so a user or a process uses a path name to
identify it in an unambiguous manner. A path name is a sequence of one or more path
components separated by a slash (/), where each path component is a reference through a
directory and the last path component is the name of the file. Path names for locating a file
from the current directory are called relative path names. Relative path names are often short
and convenient to use; however, they can be confusing because a file may have different
relative path names when accessed from different current directories. For example, in Figure
6.7, the file alpha has the simple relative path name alpha when accessed from current directory
A, whereas it has relative path names of the form ../alpha and ../../alpha when accessed from the
directories projects and real_time, respectively. To facilitate use of relative path names, each
directory stores information about its own parent directory in the directory structure.
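Relative path resolution using the stored parent pointers can be sketched as below. The tree mirrors the alpha/projects/real_time example above; the node and file representations are invented.

```python
# Each directory stores its parent (the ".." component) and its entries.
tree = {
    "root":      {"parent": None,       "entries": {"A": "A"}},
    "A":         {"parent": "root",     "entries": {"alpha": "file:alpha",
                                                    "projects": "projects"}},
    "projects":  {"parent": "A",        "entries": {"real_time": "real_time"}},
    "real_time": {"parent": "projects", "entries": {}},
}

def resolve(current, path):
    """Resolve a relative path name starting from the current directory."""
    node = current
    for comp in path.split("/"):
        if comp == "..":
            node = tree[node]["parent"]          # follow the parent pointer
        else:
            node = tree[node]["entries"][comp]   # descend via a directory entry
    return node
```

As in the text, alpha has the relative path names alpha, ../alpha, and ../../alpha from A, projects, and real_time, and all three resolve to the same file.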
In a directory tree, each file except the root directory has exactly one parent directory. This
directory structure provides total separation of different users' files and complete file naming
freedom. However, it makes file sharing rather cumbersome.
A user wishing to access another user‘s files has to use a path name that involves two or more
directories. For example, in Figure 6.8, user B can access file beta using the path name
../A/projects/beta or ~A/projects/beta.
Use of the tree structure leads to a fundamental asymmetry in the way different users can access
a shared file. The file will be located in some directory belonging to one of the users, who can
access it with a shorter path name than can other users. This problem can be solved by
organizing the directories in an acyclic graph structure. In this structure, a file can have many
parent directories, and so a shared file can be pointed to by directories of all users who have
access to it. Acyclic graph structures are implemented through links.
Links A link is a directed connection between two existing files in the directory structure. It
can be written as a triple (<from_file_name>, <to_file_name>, <link_name>), where
<from_file_name> is a directory and <to_file_name> can be a directory or a file. Once a link is
established, <to_file_name> can be accessed as if it were a file named
<link_name> in the directory <from_file_name>. The fact that <link_name> is a link in the
directory <from_file_name> is indicated by putting the value "L" in its flags field. Example
6.4 illustrates how a link is set up.
Example 6.4 Link in a Directory Structure
Figure 6.8 shows the directory structure after user C creates a link using the command (~C,
~C/software/web_server, quest). The name of the link is quest. The link is made in the directory
~C and it points to the file ~C/software/web_server. This link permits ~C/software/web_server
to be accessed by the name ~C/quest.
An unlink command nullifies a link. Implementation of the link and unlink commands involves
manipulation of directories that contain the files <from_file_name> and
<to_file_name>. Deadlocks may arise while link and unlink commands are implemented if
several processes issue these commands simultaneously.
6.5 FILE PROTECTION
A user would like to share a file with collaborators, but not with others. We call this requirement
controlled sharing of files. To implement it, the owner of a file specifies which users can access
the file in what manner. The file system stores this information in the protection info field of
the file‘s directory entry, and uses it to control access to the file.
There are different methods of structuring the protection information of files. In this section, we assume
that a file's protection information is stored in the form of an access control list. Each element
of the access control list is an access control pair of the
form (<user_name>, <list_of_access_privileges>). When a process executed by some user X
tries to perform an operation <opn> on file alpha, the file system searches for the pair with
<user_name>= X, in the access control list of alpha and checks whether <opn> is consistent
with the <list_of_access_privileges>. If it is not, the attempt to access alpha fails. For example,
a write attempt by X will fail if the entry for user X in the access control list is (X, read), or if
the list does not contain an entry for X.
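The check described above can be sketched directly. The list structure follows the (<user_name>, <list_of_access_privileges>) pairs of the text; the users and privileges are invented.

```python
# Access control list of file alpha: one pair per user.
acl_alpha = [("X", ["read"]), ("Y", ["read", "write"])]

def is_allowed(acl, user, opn):
    """True if the ACL grants `user` the privilege needed for operation `opn`."""
    for name, privileges in acl:
        if name == user:
            return opn in privileges   # found the user's pair: check consistency
    return False                       # no entry for the user: access denied
```

A write attempt by X fails because X's pair grants only read, and any attempt by a user with no pair in the list fails.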
The size of a file‘s access control list depends on the number of users and the number of access
privileges defined in the system. To reduce the size of protection information, users can be
classified in some convenient manner and an access control pair can be specified for each user
class rather than for each individual user. Now an access control list has only as many pairs as
the number of user classes. For example, Unix specifies access privileges for three classes of
users— the file owner, users in the same group as the owner, and all other users of the system.
In most file systems, access privileges are of three kinds—read, write, and execute. A write
privilege permits existing data in the file to be modified and also permits new data to be added.
One can further differentiate between these two privileges by defining a new access privilege
called append; however, it would increase the size of the protection information. The execute
privilege permits a user to execute the program contained in a file. Access privileges have
different meanings for directory files. The read privilege for a directory file implies that one
can obtain a listing of the directory, while the write privilege for a directory implies that one
can create new files in the directory. The execute privilege for a directory permits an access to
be made through it—that is, it permits a file existing in the directory to be accessed. A user can
use the execute privilege of directories to make a part of his directory structure visible to other
users.
6.6 ALLOCATION OF DISK SPACE
A disk may contain many file systems, each in its own partition of the disk. The file system
knows which partition a file belongs to, but the IOCS does not. Hence disk space allocation is
performed by the file system.
Early file systems adapted the contiguous memory allocation model by allocating a single
contiguous disk area to a file when it was created. This model was simple to implement. It also
provided data access efficiency by reducing disk head movement during sequential access to
data in a file. However, contiguous allocation of disk space led to external fragmentation.
Interestingly, it also suffered from internal fragmentation because the file system found it
prudent to allocate some extra disk space to allow for expansion of a file.
Contiguity of disk space also necessitated complicated arrangements to avoid use of bad disk
blocks: The file system identified bad disk blocks while formatting the disk and noted their
addresses. It then allocated substitute disk blocks for the bad ones and built a table showing
addresses of bad blocks and their substitutes. During a read/write operation, the IOCS
checked whether the disk block to be accessed was a bad block. If it was, it obtained the
address of the substitute disk block and accessed it. Modern file systems adapt the
noncontiguous memory allocation model to disk space allocation. In this approach, a chunk
of disk space is allocated on demand, i.e., when the file is created or when its size grows
because of an update operation. The file system has to address three issues for implementing
this approach:
• Managing free disk space: Keep track of free disk space and allocate from it when a file
requires a new disk block.
• Avoiding excessive disk head movement: Ensure that data in a file is not dispersed to different
parts of a disk, as it would cause excessive movement of the disk heads during file processing.
• Accessing file data: Maintain information about the disk space allocated to a file and use it
to find the disk block that contains required data.
The file system can maintain a free list of disk space and allocate from it when a file requires
a new disk block. Alternatively, it can use a table called the disk status map (DSM) to indicate
the status of disk blocks. The DSM has one entry for each disk block, which indicates whether
the disk block is free or has been allocated to a file. This information can be maintained in a
single bit, and so a DSM is also called a bit map. Figure 13.12 illustrates a DSM. A 1 in an
entry indicates that the corresponding disk block is allocated. The DSM is consulted every time
a new disk block has to be allocated to a file. To avoid dispersing file data to different parts of
a disk, file systems confine the disk space allocation for a file either to consecutive disk blocks,
which form an extent, also called a cluster, or consecutive cylinders in a disk, which form
cylinder groups.
Use of a disk status map, rather than a free list, has the advantage that it allows the file system
to readily pick disk blocks from an extent or cylinder group.
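A bit-map allocation step can be sketched as below. The DSM state is invented; the "allocate near a hinted block" policy is one illustrative way to keep a file's blocks close together and limit disk head movement.

```python
# Disk status map: one entry per disk block; 1 = allocated, 0 = free.
dsm = [1, 1, 0, 1, 0, 0, 1, 0]

def allocate_near(dsm, hint):
    """Allocate the free block closest to block `hint`; return its number."""
    candidates = [b for b, bit in enumerate(dsm) if bit == 0]
    if not candidates:
        raise RuntimeError("disk full")          # no free block in the DSM
    block = min(candidates, key=lambda b: abs(b - hint))
    dsm[block] = 1                               # mark the block allocated
    return block
```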
There are two fundamental approaches to noncontiguous disk space allocation. They differ in the
manner in which they maintain information about disk space allocated to a file.
6.6.1 Linked Allocation
A file is represented by a linked list of disk blocks. Each disk block has two fields in it—data
and metadata. The data field contains the data written into the file, while the metadata field is
the link field, which contains the address of the next disk block allocated to the file. Figure
6.10 illustrates linked allocation. The location info field of the directory entry of file alpha
points to the first disk block of the file. Other blocks are accessed by following the pointers in
the list of disk blocks. The last disk block contains null information in its metadata field. Thus,
file alpha consists of disk blocks 3 and 2, while file beta consists of blocks 4, 5,
and 7. Free space on the disk is represented by a free list in which each free disk block
contains a pointer to the next free disk block. When a disk block is needed to store new data
added to a file, a disk block is taken off the free list and added to the file's list of disk blocks.
To delete a file, the file‘s list of disk blocks is simply added to the free list.
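The linked arrangement of Figure 6.10 can be sketched as below, using the block numbers given in the text (alpha in blocks 3 and 2; beta in blocks 4, 5, and 7); the data strings are invented.

```python
# Each disk block holds (data field, metadata field); the metadata field is
# the link: the address of the next block, or None at the end of the file.
disk = {
    3: ("alpha data 1", 2), 2: ("alpha data 2", None),
    4: ("beta data 1", 5), 5: ("beta data 2", 7), 7: ("beta data 3", None),
}

def blocks_of(first_block):
    """Follow the links from a file's first block; return block numbers in order."""
    chain, b = [], first_block
    while b is not None:
        chain.append(b)
        b = disk[b][1]            # metadata field: address of the next block
    return chain
```

Note that reaching block i requires reading blocks 1 through i-1 first, which is why nonsequential access is inefficient, and why a corrupted link loses the rest of the file.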
Linked allocation is simple to implement, and incurs a low allocation/ deallocation overhead.
It also supports sequential files quite efficiently. However, files with nonsequential
organization cannot be accessed efficiently. Reliability is also poor since corruption of the
metadata field in a disk block may lead to loss of data in the entire file. Similarly, operation of
the file system may be disrupted if a pointer in the free list is corrupted.
6.6.2 File Allocation Table (FAT)
MS-DOS uses a variant of linked allocation that stores the
metadata separately from the file data. A file allocation table (FAT) of a disk is an array that
has one element corresponding to every disk block in the disk. For a disk block that is allocated
to a file, the corresponding FAT element contains the address of the next disk block. Thus the
disk block and its FAT element together form a pair that contains the same information as the
disk block in a classical linked allocation scheme.
The directory entry of a file contains the address of its first disk block. The FAT element
corresponding to this disk block contains the address of the second disk block, and so on. The
FAT element corresponding to the last disk block contains a special code to indicate that the
file ends on that disk block. Figure 6.11 illustrates the FAT for the disk of Figure 6.10. The file
alpha consists of disk blocks 3 and 2. Hence the directory entry of alpha contains 3. The FAT
entry for disk block 3 contains 2, and the FAT entry for disk block 2 indicates that the file ends
on that disk block. The file beta consists of blocks 4, 5, and 7. The FAT can also be used to
store free space information. The list of free disk blocks can be stored as if it were a file, and
the address of the first free disk block can be held in a free list pointer. Alternatively, some
special code can be stored in the FAT element corresponding to a free disk block, e.g., the code
"free" in Figure 6.11. Use of the FAT rather than the classical linked allocation involves a
performance penalty, since the FAT has to be accessed to obtain the address of the next disk
block. To overcome this problem, the FAT is held in memory during file processing. Use of the
FAT provides higher reliability than classical linked allocation because corruption of a disk
block containing file data leads to limited damage. However, corruption of a disk block used
to store the FAT is disastrous.
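The FAT of Figure 6.11 can be sketched as below, using the block numbers from the text; the directory representation is simplified to a name-to-first-block mapping.

```python
# FAT: one element per disk block. An element holds the address of the next
# block of the same file, "end" for a file's last block, or "free".
fat = {2: "end", 3: 2, 4: 5, 5: 7, 6: "free", 7: "end"}

# Directory entries hold only the first block of each file.
directory = {"alpha": 3, "beta": 4}

def fat_chain(first_block):
    """Return a file's blocks in order by following its FAT elements."""
    chain, b = [], first_block
    while True:
        chain.append(b)
        nxt = fat[b]
        if nxt == "end":          # special code: file ends on this block
            return chain
        b = nxt
```

Since every link lookup goes through the table rather than the disk blocks themselves, holding the FAT in memory during file processing avoids the extra I/O per block.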
6.6.3 Indexed Allocation
In indexed allocation, an index called the file map table (FMT) is maintained to note the
addresses of disk blocks allocated to a file. In its simplest form, an FMT can be an array
containing disk block addresses. Each disk block contains a single field—the data field. The
location info field of a file‘s directory entry points to the FMT for the file.
In the following discussion we use the notation fmt_alpha for the FMT of the file alpha. If the
size of the file alpha grows, the DSM is searched to locate a free block, and the address of the
block is added to fmt_alpha. Deallocation is performed when alpha is deleted. All disk blocks
pointed to by fmt_alpha are marked free before fmt_alpha and the directory entry of alpha are
erased.
The reliability problem is less severe in indexed allocation than in linked allocation because
corruption of an entry in an FMT leads to only limited damage.
Compared with linked allocation, access to sequential-access files is less efficient because the
FMT of a file has to be accessed to obtain the address of the next disk block. However, access
to records in a direct-access file is more efficient since the address of the disk block that
contains a specific record can be obtained directly from the FMT. For example, if address
calculation analysis shows that a required record exists in the ith disk block of a file, its address
can be obtained from the ith entry of the FMT. For a small file, the FMT can be stored in the
directory entry of the file; it is both convenient and efficient. For a medium or large file, the
FMT will not fit into the directory entry. A two-level indexed allocation depicted in Figure 6.13
may be used for such FMTs. In this organization, each entry of the FMT contains the address
of an index block. An index block does not contain data; it contains entries that contain
addresses of data blocks. To access a data block, we first access the relevant entry of the FMT
to obtain the address of an index block, and then access an entry of that index block to obtain
the address of the data block. A hybrid FMT organization combines the two schemes: the first
few entries of the FMT, say n entries, point directly to data blocks as in the conventional
indexed allocation, while the other entries point to index blocks. The
advantage of this arrangement is that small files containing n or fewer data blocks continue to
be accessible efficiently, as their FMT does not use index blocks. Medium and large files suffer
a marginal degradation of their access performance because of multiple levels of indirection.
The Unix file system uses a variation of the hybrid FMT organization.
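The two-level lookup can be sketched as follows. All block numbers and the table size are made up for illustration, and the structure is a simplification; a real Unix inode mixes direct, single-indirect, double-indirect, and triple-indirect entries.

```python
# Sketch of a two-level FMT lookup (illustrative numbers, simplified layout).
ENTRIES_PER_INDEX_BLOCK = 4

# Index blocks, stored "on disk" as lists of data-block addresses.
index_blocks = {10: [20, 21, 22, 23], 11: [24, 25, 26, 27]}
fmt = [10, 11]   # each FMT entry holds the address of an index block

def data_block_of(file_block_no, fmt, index_blocks):
    """Map the i-th data block of a file to its disk address via two levels."""
    fmt_entry = file_block_no // ENTRIES_PER_INDEX_BLOCK
    offset = file_block_no % ENTRIES_PER_INDEX_BLOCK
    index_block = index_blocks[fmt[fmt_entry]]  # first access: the index block
    return index_block[offset]                  # second level: data block address

print(data_block_of(5, fmt, index_blocks))  # 25: index block 11, entry 1
```

The extra level of indirection is the "marginal degradation" the text refers to: every data block access costs one additional index-block access unless the index block is cached.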
Two performance issues are associated with the use of a disk block as the unit of disk space
allocation—size of the metadata, i.e., the control data of the file system; and efficiency of
accessing file data. Both issues can be addressed by using a larger unit of allocation of disk
space. Hence modern file systems tend to use an extent, also called a cluster, as a unit of disk
space allocation. An extent is a set of consecutive disk blocks. Use of large extents provides
better access efficiency. However, it causes more internal fragmentation. To get the best of both
worlds, file systems prefer to use variable extent sizes. Their metadata contains the size of an
extent along with its address.
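The extent-based metadata described above can be sketched as follows; the extent list and block numbers are hypothetical, and real file systems store this in on-disk structures such as extent trees:

```python
# Sketch: variable-size extents recorded as (start_block, size) pairs.
extents = [(100, 4), (220, 8), (64, 2)]   # hypothetical allocation for one file

def block_for_offset(file_block_no, extents):
    """Translate a file-relative block number to a disk block address."""
    for start, size in extents:
        if file_block_no < size:
            return start + file_block_no   # blocks in an extent are consecutive
        file_block_no -= size              # skip past this extent
    raise ValueError("offset beyond end of file")

print(block_for_offset(0, extents))   # 100: first block of the first extent
print(block_for_offset(5, extents))   # 221: second block of the second extent
```

Because blocks within an extent are consecutive on disk, one (address, size) pair describes many blocks, which is why extents shrink the metadata and improve sequential access.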
The file system uses the IOCS to perform I/O operations and the IOCS implements them
through kernel calls. The interface between the file system and the IOCS consists of three data
structures—the file map table (FMT), the file control block (FCB), and the open files table
(OFT)—and functions that perform I/O operations. Use of these data structures avoids repeated
processing of file attributes by the file system, and provides a convenient method of tracking
the status of ongoing file processing activities.
The file system allocates disk space to a file and stores information about the allocated disk
space in the file map table (FMT). The FMT is typically held in memory during the processing
of a file.
A file control block (FCB) contains all information concerning an ongoing file processing
activity. This information can be classified into the three categories shown in Table 6.3.
Information in the file organization category is either simply extracted from the file declaration
statement in an application program, or inferred from it by the compiler, e.g., information such
as the size of a record and number of buffers is extracted from a file declaration, while the
name of the access method is inferred from the type and organization of a file. The compiler
puts this information as parameters in the open call. When the call is made during execution of
the program, the file system puts this information in the FCB. Directory information is copied
into the FCB through joint actions of the file system and the IOCS when a new file is created.
Information concerning the current state of processing is written into the FCB by the IOCS.
This information is continually updated during the processing of a file.
The open files table (OFT) holds the FCBs of all open files. The OFT
resides in the kernel address space so that user processes cannot tamper with it. When a file is
opened, the file system stores its FCB in a new entry of the OFT. The offset of this entry in the
OFT is called the internal id of the file. The internal id is passed back to the process, which
uses it as a parameter in all future file system calls.
Figure 6.15 shows the arrangement set up when a file alpha is opened. The file system copies
fmtalpha in memory; creates fcbalpha, which is an FCB for alpha, in the OFT; initializes its
fields appropriately; and passes back its offset in OFT, which in this case is 6, to the process as
internal_idalpha.
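The arrangement of Figure 6.15 can be sketched as follows. The class and function names are hypothetical, and the FCB is reduced to three representative fields, one per category of Table 6.3:

```python
# Sketch of the open files table (OFT) arrangement; names are illustrative.
class FCB:
    def __init__(self, name, fmt):
        self.name = name         # directory information about the file
        self.fmt = fmt           # FMT copied into memory when the file is opened
        self.next_record = 0     # current state of the file processing activity

# Pretend six files are already open, occupying OFT entries 0..5.
oft = [FCB(f"file{i}", []) for i in range(6)]

def fs_open(name, fmt, oft):
    """Create an FCB in the OFT and return its offset as the internal id."""
    fcb = FCB(name, fmt)
    for i, entry in enumerate(oft):
        if entry is None:        # reuse a slot freed by an earlier close
            oft[i] = fcb
            return i
    oft.append(fcb)
    return len(oft) - 1

internal_id_alpha = fs_open("alpha", [3, 2], oft)
print(internal_id_alpha)         # 6, matching the Figure 6.15 example
```

The process stores only the internal id; because the OFT itself lives in the kernel address space, the process cannot tamper with the FCB it indexes.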
Table 6.3 Fields in the File Control Block (FCB)
Figure 6.15 Interface between file system and IOCS—OFT, FCB and FMT.
The file system supports the following operations:
• open (<file_name>, <processing_mode>, <file_attributes>)
• read (<internal_id_of_file>, <record_info>, <I/O_area addr>)
• write (<internal_id_of_file>, <record_info>, <I/O_area addr>)
• close (<internal_id_of_file>)
<file_attributes> is a list of file attributes, such as the file's organization, record size, and
protection information. It is relevant only when a new file is being created—attributes from the
list are copied into the directory entry of the file at this time. <record_info> indicates the
identity of the record to be read or written if the file is to be processed in a nonsequential
mode. <I/O_area addr> indicates the address of the memory area where data from the record
should be read, or the memory area that contains the data to be written into the record.
The iocs-open and iocs-close operations are specialized read and write operations that copy
information into the FCB from the directory entry or from the FCB into the directory entry.
The iocs-read/write operations access the FCB to obtain information concerning the current
state of the file processing activity, such as the address of the next record to be processed. When
a write operation requires more disk space, iocs-write invokes a function of
the file system to perform disk space allocation.
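The iocs-read and iocs-write behavior just described can be sketched as follows, for a sequential file with one record per disk block. The structures and names are hypothetical simplifications; in particular, disk space allocation is passed in as a callback standing in for the file system function that iocs-write would invoke:

```python
# Sketch of iocs-read/write consulting the FCB (illustrative, sequential file).
class FCB:
    def __init__(self, fmt):
        self.fmt = fmt          # in-memory FMT: list of disk block addresses
        self.next_record = 0    # current state of the file processing activity

disk = {3: "rec0", 2: "rec1"}   # one record per disk block, for simplicity

def iocs_read(fcb, io_area):
    """Read the next record into the caller's I/O area and advance the FCB."""
    block = fcb.fmt[fcb.next_record]
    io_area.append(disk[block])
    fcb.next_record += 1

def iocs_write(fcb, data, allocate_block):
    """Write the next record; ask the file system for disk space when needed."""
    if fcb.next_record >= len(fcb.fmt):
        fcb.fmt.append(allocate_block())   # file system allocates a new block
    disk[fcb.fmt[fcb.next_record]] = data
    fcb.next_record += 1

fcb = FCB([3, 2])
area = []
iocs_read(fcb, area)
print(area)          # ['rec0']
```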
Figure 6.16 is a schematic diagram of the processing of an existing file alpha in a process
executed by some user U. The compiler replaces the statements open, read, and close in the
source program with calls on the file system operations open, read, and close, respectively. The
following are the significant steps in file processing involving the file system and the IOCS,
shown by numbered arrows in Figure 6.16:
1. The process executes the call open (alpha, 'read', <file_attributes>). The call returns with
internal_idalpha if the processing mode "read" is consistent with protection information of
the file. The process saves internal_idalpha for use while performing operations on file alpha.
2. The file system creates a new FCB in the open files table. It resolves the path name alpha,
locates the directory entry of alpha, and stores the information about it in the new FCB for use
while closing the file. Thus, the new FCB becomes fcbalpha. The file system now makes a call
iocs-open with internal_idalpha and the address of the directory entry of alpha as parameters.
3. The IOCS accesses the directory entry of alpha, and copies the file size and address of the
FMT, or the FMT itself, from the directory entry into fcbalpha.
4. When the process wishes to read a record of alpha into area xyz, it invokes the read operation
of the file system with internal_idalpha, <record_info>, and Ad(xyz) as parameters.
5. Information about the location of alpha is now available in fcbalpha. Hence the read/write
operations merely invoke iocs-read/write operations.
8. The IOCS obtains information about the directory entry of alpha from fcbalpha and copies
the file size and FMT address, or the FMT itself, from fcbalpha into the directory entry of alpha.
Introduction
File sharing is the practice of distributing or providing access to digital files between
two or more users or devices. While it is a convenient way to share information and
collaborate on projects, it also comes with risks such as malware and viruses, data
breaches, legal consequences, and identity theft. Protecting files during sharing is
essential to ensure confidentiality, integrity, and availability. Encryption, password
protection, secure file transfer protocols, and regularly updating antivirus and anti-
malware software are all important measures that can be taken to safeguard files.
This section will explore the different types of file sharing, risks associated with file
sharing, protection measures, and best practices for secure file sharing.
File sharing refers to the process of sharing or distributing electronic files such as
documents, music, videos, images, and software between two or more users or
computers.
File sharing plays a vital role in facilitating collaboration and communication among
individuals and organizations. It allows people to share files quickly and easily across
different locations, reducing the need for physical meetings and enabling remote
work. File sharing also helps individuals and organizations save time and money, as
it eliminates the need for physical transportation of files.
File sharing can pose several risks and challenges, including the spread of malware
and viruses, data breaches and leaks, legal consequences, and identity theft.
Unauthorized access to sensitive files can also result in loss of intellectual property,
financial losses, and reputational damage.
With the increase in cyber threats and the sensitive nature of the files being shared,
it is essential to implement adequate file protection measures to secure the files
from unauthorized access, theft, and cyberattacks. Effective file protection
measures can help prevent data breaches and other cyber incidents, safeguard
intellectual property, and maintain business continuity.
• Peer-to-Peer (P2P) File Sharing − Peer-to-peer file sharing allows users to share files with each
other without the need for a centralized server. Instead, users connect to each other directly and
exchange files through a network of peers. P2P file sharing is commonly used for sharing large files
such as movies, music, and software.
• Cloud-Based File Sharing − Cloud-based file sharing involves the storage of files in a remote
server, which can be accessed from any device with an internet connection. Users can upload and
download files from cloud-based file sharing services such as Google Drive, Dropbox, and OneDrive.
Cloud-based file sharing allows users to easily share files with others, collaborate on documents,
and access files from anywhere.
• Direct File Transfer − Direct file transfer involves the transfer of files between two devices
through a direct connection such as Bluetooth or Wi-Fi Direct. Direct file transfer is commonly used
for sharing files between mobile devices or laptops.
• Removable Media File Sharing − Removable media file sharing involves the use of physical
storage devices such as USB drives or external hard drives. Users can copy files onto the device and
share them with others by physically passing the device to them.
Each type of file sharing method comes with its own set of risks and challenges.
Peer-to-peer file sharing can expose users to malware and viruses, while cloud-
based file sharing can lead to data breaches if security measures are not
implemented properly. Direct file transfer and removable media file sharing can also
lead to data breaches if devices are lost or stolen.
To protect against these risks, users should take precautions such as using
encryption, password protection, secure file transfer protocols, and regularly
updating antivirus and antimalware software. It is also essential to educate users
on safe file sharing practices and limit access to files only to authorized individuals
or groups. By taking these steps, users can ensure that their files remain secure
and protected during file sharing.
• Malware and Viruses − One of the most significant risks of file sharing is the spread of malware
and viruses. Files obtained from untrusted sources, such as peer-to-peer (P2P) networks, can
contain malware that can infect the user's device and compromise the security of their files.
Malware and viruses can cause damage to the user's device, steal personal information, or even use
their device for illegal activities without their knowledge.
• Data Breaches and Leaks − Another significant risk of file sharing is the possibility of data
breaches and leaks. Cloud-based file sharing services and P2P networks are particularly vulnerable
to data breaches if security measures are not implemented properly. Data breaches can result in
the loss of sensitive information, such as personal data or intellectual property, which can have
severe consequences for both individuals and organizations.
• Legal Consequences − File sharing copyrighted material without permission can lead to legal
consequences. Sharing copyrighted music, movies, or software can result in copyright infringement
lawsuits and hefty fines.
• Identity Theft − File sharing can also expose users to identity theft. Personal information, such as
login credentials or social security numbers, can be inadvertently shared through file sharing if
security measures are not implemented properly. Cybercriminals can use this information to
commit identity theft, which can have severe consequences for the victim.
To protect against these risks, users should take precautions such as using trusted
sources for file sharing, limiting access to files, educating users on safe file sharing
practices, and regularly updating antivirus and anti-malware software. By taking
these steps, users can reduce the risk of malware and viruses, data breaches and
leaks, legal consequences, and identity theft during file sharing.