Module 4 Linux
MODULE 4
File Systems
File systems are a crucial part of any operating system, providing a structured way to store,
organize, and manage data on storage devices such as hard drives, SSDs, and USB drives.
A file system acts as a bridge between the operating system and the physical storage hardware,
allowing users and applications to create, read, update, and delete files in an organized and
efficient manner.
A file system is a method an operating system uses to store, organize, and manage files and
directories on a storage device. Some common types of file systems include:
• FAT (File Allocation Table): An older file system used by older versions of Windows
and other operating systems.
• NTFS (New Technology File System): A modern file system used by Windows. It
supports features such as file and folder permissions, compression, and encryption.
• EXT (Extended File System): A file system commonly used on Linux and Unix-based
operating systems.
• APFS (Apple File System): A new file system introduced by Apple for their Macs and
iOS devices.
File Concepts:
1. File Attributes:
Each file has characteristics such as its name, its type, and the date on which
it was created. These characteristics are referred to as 'File Attributes'. The
operating system associates these attributes with files. The set of attributes
may differ from one operating system to another. Attributes are also called
metadata.
1. Name: File name is the name given to the file. A name is usually a string of
characters.
2. Identifier: Identifier is a unique number for a file. It identifies files within the
file system. It is not readable to us, unlike file names.
3. Type: Type is another attribute of a file which specifies the type of file such as
archive file (.zip), source code file (.c, .java), .docx file, .txt file, etc.
4. Location: A pointer to the device and to the location of the file on that
device (for example, the directory path).
5. Size: Specifies the current size of the file (in KB, MB, GB, etc.) and possibly the
maximum allowed size of the file.
6. Protection: Specifies access-control information (permissions governing who
can read, write, and execute the file). It provides security for sensitive
and private information.
7. Time, date, and user identification: Records the date and time at which the
file was created and last modified, and which user created and last modified
it.
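The attributes listed above can be inspected from a program. The following is a minimal sketch in Python, assuming a Unix-like system where the identifier is the inode number. The helper name describe_file and the file name notes.txt are hypothetical and used only for illustration.

```python
import os
import stat
import tempfile
import time


def describe_file(path):
    """Return a dict of common file attributes as reported by the OS."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "identifier": info.st_ino,                   # inode number on Unix-like systems
        "type": os.path.splitext(path)[1],           # extension, used here as a type hint
        "location": os.path.dirname(os.path.abspath(path)),
        "size_bytes": info.st_size,
        "protection": stat.filemode(info.st_mode),   # permission string such as '-rw-r--r--'
        "modified": time.ctime(info.st_mtime),
        "owner_uid": info.st_uid,
    }


# Demonstration on a throwaway file (the file name is hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "notes.txt")
    with open(demo_path, "w") as f:
        f.write("hello")
    attrs = describe_file(demo_path)
    for key, value in attrs.items():
        print(f"{key}: {value}")
```

The identifier, protection string, and timestamps will differ from one machine to another, so no fixed output is shown.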
2. File Operations:
The operating system must provide the basic file operations given below.
• Creating a file: Two steps are necessary to create a file. First, space in the file
system must be found for the file. Second, an entry for the new file must be
made in the directory.
• Writing a file: To write a file, we make a system call specifying both the name
of the file and the information to be written to the file. Given the name of the
file, the system searches the directory to find the file's location. The system
must keep a write pointer to the location in the file where the next write is to
take place. The write pointer must be updated whenever a write occurs.
• Reading a file: To read from a file, we use a system call that specifies the
name of the file and where (in memory) the next block of the file should be
put. Again, the directory is searched for the associated entry, and the system
needs to keep a read pointer to the location in the file where the next read is to
take place. Once the read has taken place, the read pointer is updated.
• Repositioning within a file: The directory is searched for the appropriate
entry, and the current-file-position pointer is repositioned to a given value.
Repositioning within a file need not involve any actual I/O. This file operation
is also known as a file seek.
• Deleting a file: To delete a file, we search the directory for the named file.
Having found the associated directory entry, we release all file space, so that
it can be reused by other files, and erase the directory entry.
• Protection: Access-control information determines who can do reading,
writing, executing, and so on.
• Truncating a file: The user may want to erase the contents of a file but keep
its attributes. Rather than forcing the user to delete the file and then recreate
it, this function allows all attributes to remain unchanged (except for file
length) but lets the file be reset to length zero and its file space released.
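The operations above map closely onto the low-level calls that Python exposes in its os module. The sketch below walks a single temporary file through create, write, reposition, read, truncate, and delete. The function name run_file_operations and the file name demo.bin are hypothetical.

```python
import os
import tempfile


def run_file_operations(path):
    """Walk one file through create, write, seek, read, truncate, delete."""
    results = {}

    # Create: the OS finds space and adds a directory entry.
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # Write: the file pointer advances by the number of bytes written.
        os.write(fd, b"hello world")
        results["pos_after_write"] = os.lseek(fd, 0, os.SEEK_CUR)

        # Reposition (seek): move the pointer without transferring any data.
        os.lseek(fd, 6, os.SEEK_SET)

        # Read: returns data from the current position and advances the pointer.
        results["read"] = os.read(fd, 5)

        # Truncate: keep the file and its attributes, reset its length to zero.
        os.ftruncate(fd, 0)
        results["size_after_truncate"] = os.fstat(fd).st_size
    finally:
        os.close(fd)

    # Delete: remove the directory entry so the space can be reused.
    os.unlink(path)
    results["exists_after_delete"] = os.path.exists(path)
    return results


with tempfile.TemporaryDirectory() as tmp:
    outcome = run_file_operations(os.path.join(tmp, "demo.bin"))
    print(outcome)
```

Note that modern systems keep one current-file-position pointer per open file, which serves as both the read pointer and the write pointer described above.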
File System Structure
A file must be structured in a format that the operating system can understand.
• A file has a certain defined structure according to its type.
• A text file is a sequence of characters organized into lines.
• A source file is a sequence of procedures and functions.
• An object file is a sequence of bytes organized into blocks that are understandable by
the machine.
• When an operating system defines different file structures, it must also contain the code to
support these file structures. Unix and MS-DOS support a minimal number of file
structures.
Files can be structured in several ways. Three common structures are briefly described
in turn.
• It is a more efficient method for reading large files, as it only reads the required data
and does not waste time reading unnecessary data.
• It is a reliable method for backup and restore operations, as the data is stored
sequentially and can be easily restored if required.
Disadvantages of Sequential Access Method
• If the record that needs to be accessed next is not adjacent to the current
record, this type of file access method is slow.
• Moving a sizable chunk of the file may be necessary to insert a new record.
• It does not allow for quick access to specific records in the file. The entire file must
be searched sequentially to find a specific record, which can be time-consuming.
• It is not well-suited for applications that require frequent updates or modifications to
the file. Updating or inserting a record in the middle of a large file can be a slow and
cumbersome process.
• Sequential access can also result in wasted storage space if records are of varying
lengths. The space between records cannot be used by other records, which can result
in inefficient use of storage.
Directory Structure
A directory is a container that is used to contain folders and files. It organizes files and folders
in a hierarchical manner. In other words, directories are like folders that help organize files
on a computer. Just like you use folders to keep your papers and documents in order, the
operating system uses directories to keep track of files and where they are stored. Different
structures of directories can be used to organize these files, making it easier to find and
manage them.
1. Single Level Directory
Advantages
1. Implementation is very simple.
2. If the sizes of the files are very small, then the searching becomes faster.
3. File creation, searching, deletion is very simple since we have only one directory.
Disadvantages
1. We cannot have two files with the same name.
2. The directory may be very big, so searching for a file may take a long time.
3. Protection cannot be implemented for multiple users.
4. There is no way to group files of the same kind.
5. Choosing a unique name for every file is difficult and limits the number of
files in the system, because most operating systems limit the number of
characters used to construct the file name.
2. Two Level Directory
In a two level directory system, we can create a separate directory for each user. There is
one master directory which contains separate directories dedicated to each user. For each
user, there is a different directory present at the second level, containing that user's
files. The system does not let a user enter another user's directory without permission.
Advantages
• The main advantage is that two or more files can have the same name as long as they
belong to different users, which is very helpful when there are multiple users.
• Security is provided, because a user is prevented from accessing other users' files.
• Searching for files becomes very easy in this directory structure.
Disadvantages
• The isolation that provides security also means that a user cannot share
files with other users.
• Although users can create their own files, they do not have the ability to
create subdirectories.
• It does not scale well, because a user cannot group files of the same type
together.
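The difference between a single directory and a two level directory can be illustrated with a small in-memory model. This is only a sketch: the class names SingleLevelDirectory and TwoLevelDirectory, and the user and file names, are hypothetical, and a real file system stores far more than a name-to-data mapping.

```python
class SingleLevelDirectory:
    """One directory shared by everyone: every file name must be unique."""

    def __init__(self):
        self.files = {}

    def create(self, name, data):
        if name in self.files:
            raise FileExistsError(name)
        self.files[name] = data


class TwoLevelDirectory:
    """A master directory that holds one separate directory per user."""

    def __init__(self):
        self.users = {}  # master directory: user name -> that user's directory

    def create(self, user, name, data):
        user_dir = self.users.setdefault(user, {})
        if name in user_dir:
            raise FileExistsError(f"{user}/{name}")
        user_dir[name] = data


# With a single directory, the second "report.txt" is rejected.
single = SingleLevelDirectory()
single.create("report.txt", "first version")
try:
    single.create("report.txt", "second version")
    collision = False
except FileExistsError:
    collision = True
print("name collision in single directory:", collision)

# With a two level directory, each user may have a "report.txt".
two_level = TwoLevelDirectory()
two_level.create("alice", "report.txt", "alice's data")
two_level.create("bob", "report.txt", "bob's data")
print(sorted(two_level.users))
```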
3. Tree Structured Directory
Advantages
• This directory structure allows subdirectories inside a directory.
• Searching is easier.
• Sorting files into important and unimportant ones becomes easier.
• This directory structure is more scalable than the other two directory structures explained.
Disadvantages
• As a user isn't allowed to access another user's directory, file sharing
among users is prevented.
• As users have the capability to make subdirectories, searching may become
complicated if the number of subdirectories increases.
• Users cannot modify the root directory data.
• If files do not fit in one directory, they might have to be placed in other directories.
Sharing can be provided by making the directory an acyclic graph. In this system, two or
more directory entries can point to the same file or subdirectory. That file or
subdirectory is shared between the directory entries.
These kinds of directory graphs can be made using links or aliases. We can have
multiple paths for the same file. Links can be either symbolic (logical) or hard
(physical).
Advantages
• Sharing of files and directories is allowed between multiple users.
• Searching becomes easy.
• Flexibility is increased as file sharing and editing access is there for multiple users.
Disadvantages
• Because of the complex structure it has, it is difficult to implement this directory
structure.
• Users must be very cautious when editing or deleting a file, as the file is accessed
by multiple users.
• To delete a file permanently, all references to the file must be deleted.
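The last point can be observed with hard links on a Linux system. The sketch below assumes a file system that supports hard links. The function name hard_link_demo and the file names are hypothetical. It creates a second directory entry for the same file, removes the first one, and shows that the data is still reachable.

```python
import os
import tempfile


def hard_link_demo(directory):
    """Create a file, add a hard link, remove the original name."""
    original = os.path.join(directory, "report.txt")
    alias = os.path.join(directory, "shared_report.txt")

    with open(original, "w") as f:
        f.write("quarterly numbers")

    os.link(original, alias)                      # second directory entry, same file
    links_with_alias = os.stat(original).st_nlink

    os.unlink(original)                           # removes one reference only
    with open(alias) as f:
        surviving_content = f.read()              # data is still reachable via the alias
    links_after_unlink = os.stat(alias).st_nlink

    return links_with_alias, links_after_unlink, surviving_content


with tempfile.TemporaryDirectory() as tmp:
    link_result = hard_link_demo(tmp)
    print(link_result)
```

The file's space is released only when the link count drops to zero, which is why every reference has to be removed.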
The allocation methods define how the files are stored in the disk blocks. There are three
main disk space or file allocation methods.
• Contiguous Allocation
• Linked Allocation
• Indexed Allocation
1. Contiguous Allocation
A single contiguous set of blocks is allocated to a file at the time of file creation. Thus, this is
a pre-allocation strategy, using variable size portions. The file allocation table needs just a
single entry for each file, showing the starting block and the length of the file. This method is
best from the point of view of the individual sequential file. Multiple blocks can be read in at
a time to improve I/O performance for sequential processing. It is also easy to retrieve a single
block.
For example, if a file starts at block b, and the ith block of the file is wanted, its location on
secondary storage is simply b+i-1.
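The address calculation above can be written as a one-line function. This is a minimal sketch in which block numbers within a file are counted from 1, as in the formula. The function name and the sample values (start block 14, length 3) are hypothetical.

```python
def contiguous_block_address(start_block, length, i):
    """Disk block that holds the i-th block (1-based) of a contiguous file."""
    if not 1 <= i <= length:
        raise IndexError("block number is outside the file")
    return start_block + i - 1


# A file that starts at disk block 14 and is 3 blocks long.
addresses = [contiguous_block_address(14, 3, i) for i in (1, 2, 3)]
print(addresses)
```

Because the address is computed directly, no other block has to be read to find it.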
Disadvantages of Contiguous Allocation
• External fragmentation will occur, making it difficult to find contiguous blocks of
space of sufficient length. A compaction algorithm will be necessary to free up
additional space on the disk.
• Also, with pre-allocation, it is necessary to declare the size of the file at the time of
creation.
2. Linked Allocation
Any free block can be added to the chain. The blocks need not be contiguous. An increase
in file size is always possible if a free disk block is available.
There is no external fragmentation because only one block at a time is needed. Internal
fragmentation can occur, but only in the last disk block of the file.
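A chain of blocks can be modelled as a table that maps each block to the next one. The sketch below is illustrative only: the block numbers are made up, the function names are hypothetical, and None marks the end of the chain.

```python
def chain_blocks(start_block, next_block):
    """Follow the chain from the first block and list every block of the file."""
    blocks = []
    current = start_block
    while current is not None:
        blocks.append(current)
        current = next_block[current]
    return blocks


def append_block(start_block, next_block, free_block):
    """Grow the file by attaching any free block to the end of the chain."""
    last = chain_blocks(start_block, next_block)[-1]
    next_block[last] = free_block
    next_block[free_block] = None


# Hypothetical file scattered over non-adjacent disk blocks.
next_block = {9: 16, 16: 1, 1: 10, 10: 25, 25: None}
before = chain_blocks(9, next_block)
append_block(9, next_block, 3)
after = chain_blocks(9, next_block)
print(before)
print(after)
```

Reaching the i-th block means following the chain from the start, so direct access is slower than with contiguous allocation.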
3. Indexed Allocation
It addresses many of the problems of contiguous and chained allocation. In this case, the file
allocation table contains a separate one-level index for each file: The index has one entry for
each block allocated to the file.
The allocation may be based on fixed-size blocks or variable-sized blocks. Allocation by
fixed-size blocks eliminates external fragmentation, whereas allocation by variable-size blocks
improves locality.
This allocation technique supports both sequential and direct access to the file and thus is the
most popular form of file allocation.
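With an index, both access styles reduce to simple list operations. The following sketch assumes fixed-size blocks and a one-level index held as a Python list. The function name and block numbers are hypothetical.

```python
def indexed_block_address(index_block, i):
    """Disk block that holds the i-th block (1-based) of the file."""
    if not 1 <= i <= len(index_block):
        raise IndexError("block number is outside the file")
    return index_block[i - 1]


# Hypothetical index: one entry per block allocated to the file.
index_block = [9, 16, 1, 10, 25]

# Direct access: jump straight to the third block of the file.
third = indexed_block_address(index_block, 3)

# Sequential access: visit the entries in order.
in_order = [indexed_block_address(index_block, i)
            for i in range(1, len(index_block) + 1)]
print(third, in_order)
```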
Disk scheduling algorithms are crucial in managing how data is read from and written to a
computer’s hard disk. These algorithms help determine the order in which disk read and write
requests are processed, significantly impacting the speed and efficiency of data access.
Common disk scheduling methods include First-Come, First-Served (FCFS), Shortest
Seek Time First (SSTF), SCAN, C-SCAN, LOOK, and C-LOOK.
• Two or more requests may be far from each other so this can result in greater disk
arm movement.
• Hard drives are one of the slowest parts of the computer system and thus need to be
accessed in an efficient manner.
Key Terms Associated with Disk Scheduling
• Seek Time: Seek time is the time taken to move the disk arm to the specified track
where the data is to be read or written. So the disk scheduling algorithm that gives a
minimum average seek time is better.
• Rotational Latency: Rotational latency is the time taken for the desired sector of the
disk to rotate into position under the read/write head. So the disk
scheduling algorithm that gives minimum rotational latency is better.
• Transfer Time: Transfer time is the time to transfer the data. It depends on the
rotating speed of the disk and the number of bytes to be transferred.
• Disk Access Time:
Disk Access Time = Seek Time + Rotational Latency + Transfer Time
Total Seek Time = Total Head Movement * Seek Time per Track
• Disk Response Time: Response time is the time a request spends waiting
to perform its I/O operation. The average response time is the mean of the response
times of all requests. Variance of response time measures how far individual requests
deviate from the average response time. So the disk scheduling algorithm that
gives minimum variance of response time is better.
• C-LOOK
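The two timing formulas given under Disk Access Time can be checked with a few lines of code. All numbers below are hypothetical and chosen only to make the arithmetic easy to follow. The function names are likewise illustrative.

```python
def disk_access_time(seek_time, rotational_latency, transfer_time):
    """Disk Access Time = Seek Time + Rotational Latency + Transfer Time."""
    return seek_time + rotational_latency + transfer_time


def total_seek_time(total_head_movement, seek_time_per_track):
    """Total Seek Time = Total Head Movement * Seek Time per Track."""
    return total_head_movement * seek_time_per_track


# Hypothetical request: 5 ms seek, 4 ms rotational latency, 1 ms transfer.
access_ms = disk_access_time(5, 4, 1)

# Hypothetical schedule: the head crosses 332 tracks at 1 ms per track.
seek_ms = total_seek_time(332, 1)
print(access_ms, seek_ms)
```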
Advantages of FCFS
Here are some of the advantages of First Come First Serve.
• Every request gets a fair chance
• No indefinite postponement
Disadvantages of FCFS
Here are some of the disadvantages of First Come First Serve.
• Does not try to optimize seek time
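FCFS simply services requests in arrival order, so its total head movement is the sum of the distances between consecutive requests. The sketch below applies it to the request queue used in the examples of these notes (82, 170, 43, 140, 24, 16, 190 with the arm at 50). The function name is hypothetical.

```python
def fcfs_head_movement(requests, head):
    """Total distance the arm travels when requests are served in arrival order."""
    total = 0
    for track in requests:
        total += abs(track - head)  # distance from current position to the request
        head = track                # the arm now rests on the serviced track
    return total


fcfs_total = fcfs_head_movement([82, 170, 43, 140, 24, 16, 190], head=50)
print(fcfs_total)  # 642
```

The arm swings back and forth across the disk (for example from 170 down to 43 and back up to 140), which is why FCFS does not optimize seek time.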
3. SCAN
In the SCAN algorithm the disk arm moves in a particular direction and services the requests
coming in its path and after reaching the end of the disk, it reverses its direction and again
services the request arriving in its path. So, this algorithm works as an elevator and is hence
also known as an elevator algorithm. As a result, the requests at the midrange are serviced
more and those arriving behind the disk arm will have to wait.
Example:
Suppose the requests to be addressed are 82, 170, 43, 140, 24, 16, 190, the Read/Write arm
is at 50, and it is also given that the disk arm should move “towards the larger value”.
The calculation takes the last track of the disk to be 199.
Therefore, the total overhead movement (total distance covered by the disk arm) is calculated
as
= (199-50) + (199-16) = 332
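The SCAN calculation above can be reproduced in code. This is a minimal sketch that handles only the case in the example, where the arm first moves towards the larger values. The last track number 199 is an assumption taken from the calculation, and the function names are hypothetical.

```python
def path_length(head, path):
    """Total distance travelled when the arm visits the tracks in path in order."""
    total = 0
    for track in path:
        total += abs(track - head)
        head = track
    return total


def scan_towards_larger(requests, head, last_track=199):
    """Return (service order, total head movement) for SCAN moving upward first."""
    ahead = sorted(t for t in requests if t >= head)
    behind = sorted((t for t in requests if t < head), reverse=True)
    # The arm sweeps to the last track before reversing, even if no request is there.
    path = ahead + [last_track] + behind
    return ahead + behind, path_length(head, path)


scan_order, scan_total = scan_towards_larger([82, 170, 43, 140, 24, 16, 190], head=50)
print(scan_order)   # [82, 140, 170, 190, 43, 24, 16]
print(scan_total)   # 332
```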
Advantages of SCAN Algorithm
Here are some of the advantages of the SCAN Algorithm.
• High throughput
4. C-SCAN
In the SCAN algorithm, the disk arm again scans the path that has been scanned, after
reversing its direction. So, it may be possible that too many requests are waiting at the other
end or there may be zero or few requests pending at the scanned area.
These situations are avoided in the C-SCAN algorithm, in which the disk arm, instead of
reversing its direction, goes to the other end of the disk and starts servicing the requests from
there. The disk arm thus moves in a circular fashion. Since the algorithm is otherwise similar
to the SCAN algorithm, it is known as C-SCAN (Circular SCAN).
Example:
Suppose the requests to be addressed are 82, 170, 43, 140, 24, 16, 190, the Read/Write arm
is at 50, and it is also given that the disk arm should move “towards the larger value”.
The calculation takes the tracks of the disk to be numbered 0 to 199.
So, the total overhead movement (total distance covered by the disk arm) is calculated as:
= (199-50) + (199-0) + (43-0) = 391
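The C-SCAN calculation above can be reproduced in the same way. The sketch assumes tracks numbered 0 to 199, as implied by the calculation, and counts the return sweep from the last track to track 0 as head movement, as the example does. The function names are hypothetical.

```python
def path_length(head, path):
    """Total distance travelled when the arm visits the tracks in path in order."""
    total = 0
    for track in path:
        total += abs(track - head)
        head = track
    return total


def cscan_towards_larger(requests, head, first_track=0, last_track=199):
    """Return (service order, total head movement) for C-SCAN moving upward."""
    ahead = sorted(t for t in requests if t >= head)
    behind = sorted(t for t in requests if t < head)
    path = ahead + [last_track]
    if behind:
        # Jump back to the first track, then keep servicing in the same direction.
        path += [first_track] + behind
    return ahead + behind, path_length(head, path)


cscan_order, cscan_total = cscan_towards_larger([82, 170, 43, 140, 24, 16, 190], head=50)
print(cscan_order)   # [82, 140, 170, 190, 16, 24, 43]
print(cscan_total)   # 391
```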
5. LOOK
The LOOK algorithm is similar to the SCAN disk scheduling algorithm, except that
the disk arm, instead of going to the end of the disk, goes only to the last request to be
serviced in front of the head and then reverses its direction from there. Thus, it prevents
the extra delay caused by unnecessary traversal to the end of the disk.
Example:
Suppose the requests to be addressed are 82, 170, 43, 140, 24, 16, 190, the Read/Write arm
is at 50, and it is also given that the disk arm should move “towards the larger value”.
So, the total overhead movement (total distance covered by the disk arm) is calculated as:
= (190-50) + (190-16) = 314
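The LOOK calculation above follows from the same kind of sketch, with the sweep stopping at the last pending request instead of the end of the disk. Only the upward-first case from the example is handled, and the function names are hypothetical.

```python
def path_length(head, path):
    """Total distance travelled when the arm visits the tracks in path in order."""
    total = 0
    for track in path:
        total += abs(track - head)
        head = track
    return total


def look_towards_larger(requests, head):
    """Return (service order, total head movement) for LOOK moving upward first."""
    ahead = sorted(t for t in requests if t >= head)
    behind = sorted((t for t in requests if t < head), reverse=True)
    # The arm reverses at the last request ahead of it, not at the end of the disk.
    order = ahead + behind
    return order, path_length(head, order)


look_order, look_total = look_towards_larger([82, 170, 43, 140, 24, 16, 190], head=50)
print(look_order)   # [82, 140, 170, 190, 43, 24, 16]
print(look_total)   # 314
```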
6. C-LOOK
Just as LOOK is similar to the SCAN algorithm, C-LOOK is similar to the
C-SCAN disk scheduling algorithm. In C-LOOK, the disk arm, instead of going to the end of
the disk, goes only to the last request to be serviced in front of the head and then from
there goes to the last request at the other end. Thus, it also prevents the extra delay
caused by unnecessary traversal to the end of the disk.
Example:
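Using the same requests (82, 170, 43, 140, 24, 16, 190), with the Read/Write arm at 50 and
moving “towards the larger value”, the total overhead movement (total distance covered by
the disk arm) is calculated as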
= (190-50) + (190-16) + (43-16) = 341
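The C-LOOK figure above can be checked with a short sketch. It uses the request queue from the earlier examples (82, 170, 43, 140, 24, 16, 190 with the arm at 50, moving towards the larger values) and counts the jump from the highest request to the lowest one as head movement, as the calculation does. The function names are hypothetical.

```python
def path_length(head, path):
    """Total distance travelled when the arm visits the tracks in path in order."""
    total = 0
    for track in path:
        total += abs(track - head)
        head = track
    return total


def clook_towards_larger(requests, head):
    """Return (service order, total head movement) for C-LOOK moving upward."""
    ahead = sorted(t for t in requests if t >= head)
    behind = sorted(t for t in requests if t < head)
    # After the highest request, jump to the lowest pending request and continue upward.
    order = ahead + behind
    return order, path_length(head, order)


clook_order, clook_total = clook_towards_larger([82, 170, 43, 140, 24, 16, 190], head=50)
print(clook_order)   # [82, 140, 170, 190, 16, 24, 43]
print(clook_total)   # 341
```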