File System Summary Sheet
LEARNING OBJECTIVES
[Figure: elements of file management. Labels: user and program commands (operation, file name); file manipulation functions; directory management; user access control; access method; file structure; records; blocking; physical blocks in main memory buffers; disk scheduling; free storage management; file allocation; physical blocks in secondary storage.]
File Organization and Access
File organization refers to the logical structuring of the records as determined by the way in which they are accessed. We choose a particular file organization based on
1. Short access time
2. Ease of update
3. Economy of storage
4. Simple maintenance
5. Reliability
We will discuss five types of file organizations:
1. The pile
2. The sequential file
3. The indexed sequential file
4. The indexed file
5. The direct or hashed file

The pile
The pile organization is shown in Figure 1:

The sequential file
5. New records are placed in a log file or transaction file.
6. Batch update is performed to merge the log file with the master file.
7. Used in batch applications.
8. Not suitable for interactive applications.

Indexed sequential file
1. The index provides a lookup capability to quickly reach the vicinity of the desired record.
2. It contains a key field and a pointer to the main file.
3. The index is searched to find the highest key value that is equal to or precedes the desired key value.
4. The search continues in the main file at the location indicated by the pointer.
5. New records are added to an overflow file.
6. The record in the main file that precedes it is updated to contain a pointer to the new record.
7. The overflow file is merged with the main file during a batch update.
8. Multiple indexes for the same key can be set up to increase efficiency.
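The index search in steps 3 and 4 of the indexed sequential file can be sketched as follows. This is a minimal illustration only: the main file contents, the index spacing of one entry per four records, and the function name `lookup` are assumptions made for the example, and the overflow file is not modelled.

```python
from bisect import bisect_right

# Hypothetical main file: records kept sorted by key (10, 20, ..., 190).
main_file = [(k, f"record-{k}") for k in range(10, 200, 10)]

# Hypothetical sparse index: one (key, position) entry per 4 main-file records.
index = [(main_file[i][0], i) for i in range(0, len(main_file), 4)]
index_keys = [k for k, _ in index]


def lookup(key):
    """Return the record stored under `key`, or None if it is absent."""
    # Step 3: highest index key that is equal to or precedes the desired key.
    slot = bisect_right(index_keys, key) - 1
    if slot < 0:
        return None  # the key is smaller than every indexed key
    # Step 4: continue sequentially in the main file from the indexed position.
    pos = index[slot][1]
    while pos < len(main_file) and main_file[pos][0] <= key:
        if main_file[pos][0] == key:
            return main_file[pos][1]
        pos += 1
    return None


print(lookup(70))  # record-70
print(lookup(75))  # None
```

The index narrows the search to a short run of the main file, so only a few records are read sequentially instead of the whole file.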
[Figure: indexed sequential file, showing n index levels, the main file, and the overflow file.]
Hierarchical or Tree-structured Directory
1. Contains a master directory with user directories underneath it.
2. Each user directory may have subdirectories and files as entries.

These access rights can be granted to specific users, to user groups, or to all users.

Record Blocking
For I/O to be performed, records must be organized as blocks. Given the size of a block, there are three methods of blocking that can be used:
7.78 | Unit 7 • Operating System
[Figure 6: records R6 to R13 on Track 2.]
Figure 6 Variable blocking: spanned.

Variable length unspanned blocking
Variable-length records (Figure 7) are used, but spanning is not employed. Space is wasted in most blocks, and the record size is limited to the block size.

[Figure 7: records R1 to R5 on Track 1. Legend: waste due to record fit to block size; gaps due to hardware design; waste due to block size constraint from fixed record size; waste due to block fit to track size.]

File allocation methods
There are three methods:
1. Contiguous allocation
2. Chained allocation
3. Indexed allocation

Contiguous allocation
Here, a single contiguous set of blocks is allocated to a file at the time of creation. Only a single entry per file is needed in the file allocation table, consisting of the starting block and the length of the file. This method suffers from external fragmentation, so compaction is needed from time to time.

[Figure: contiguous allocation of Files A, B, and C on a disk with blocks 0 to 15.]

File allocation table
File name   Start block   Length
File A      2             2
File B      5             4
File C      10            6
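Because a contiguous file is described by only a starting block and a length, locating any block of the file is a single addition. The sketch below illustrates this, together with the external fragmentation problem. The table contents, the 16-block disk, and the function names are hypothetical values chosen for the example.

```python
# Hypothetical file allocation table: name -> (starting block, length in blocks).
fat = {"A": (2, 3), "B": (9, 5)}


def physical_block(name, logical_block):
    """Map a file's logical block number to a disk block number."""
    start, length = fat[name]
    if not 0 <= logical_block < length:
        raise IndexError("logical block outside the file")
    return start + logical_block


def largest_free_run(total_blocks):
    """Length of the longest run of free blocks on the disk."""
    used = set()
    for start, length in fat.values():
        used.update(range(start, start + length))
    best = run = 0
    for block in range(total_blocks):
        run = 0 if block in used else run + 1
        best = max(best, run)
    return best


print(physical_block("B", 2))  # 11
print(largest_free_run(16))    # 4
```

In this layout 8 of the 16 blocks are free, yet the largest file that can be created is 4 blocks long. That gap between total free space and the largest usable run is the external fragmentation that compaction removes.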
[Figure: chained allocation of File A on a disk with blocks 0 to 15, with the start and end of the chain marked.]

Indexed file allocation
The file allocation table contains a separate one-level index for each file. The index has one entry for each portion allocated to the file. The file allocation table contains the block number of the index. If a file requires n data blocks, then n + 1 blocks are used, where the first block contains the index information (pointers to the data blocks).

[Figure: indexed allocation with block pointers. The file allocation table (FAT) lists file name and index block; File B has index block 10.]
[Figure: indexed allocation with variable-length portions. The FAT lists file name and index block; File A has index block 14, and the index block holds start block and length entries.]

Free Space Management
In addition to the file allocation table, a disk allocation table is also required to know which blocks on the disk are available. Some of the free space management techniques are as follows:
1. Bit tables
2. Chained free portions
3. Indexing
4. Free block list

Bit tables
This method uses a vector containing one bit for each block on the disk. Each ‘0’ entry corresponds to a free block and each ‘1’ corresponds to a block in use.
Advantages
1. It is easy to find one free block or a contiguous group of free blocks.
2. The table is small.
The amount of memory (in bytes) required for a block bitmap is
(disk size in bytes) / (8 × file system block size)

Chained free portion
The free portions may be chained together by using a pointer and a length value in each free portion. This method has negligible space overhead and is suitable for all file allocation methods. After some use, the disk becomes quite fragmented. File creation is slow when many individual blocks must be allocated, and so is deletion.

Indexing
This approach treats the free space as a file and uses an index table, as in file allocation. The index should be on the basis of variable-size portions rather than blocks.

Free block list
Here, each block is assigned a number sequentially, and the list of the numbers of all free blocks is maintained in a reserved portion of the disk.
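The bit table technique and its memory formula can be sketched in a few lines. This is an illustration under assumptions: the function names and the sample disk and block sizes are made up for the example, and the table is held as a plain Python list rather than packed bits.

```python
def bitmap_bytes(disk_bytes, block_bytes):
    """Bytes needed for a bit table holding one bit per disk block."""
    blocks = disk_bytes // block_bytes
    return (blocks + 7) // 8  # round up to whole bytes


def find_free_run(bits, count):
    """Index of the first run of `count` free blocks (0 = free, 1 = in use).

    Returns -1 when no such run exists.
    """
    run = 0
    for i, bit in enumerate(bits):
        run = run + 1 if bit == 0 else 0
        if run == count:
            return i - count + 1
    return -1


print(bitmap_bytes(16 * 2**30, 512))               # 4194304
print(find_free_run([1, 0, 0, 1, 0, 0, 0, 1], 3))  # 4
```

A 16 GB disk with 512-byte blocks therefore needs a 4 MB bit table, which is small relative to the disk but may still be too large to scan exhaustively on every allocation.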
Volumes
A volume is a collection of addressable sectors in secondary memory that an OS or application can use for data storage. The sectors in a volume need not be consecutive on a physical storage device. In the simplest case, a single disk equals one volume.

Example 1: A direct access file has fixed-size 50-byte records. Assuming the first record is record 1, the first byte of record 10 will be at what logical location?
Solution:
Records 1 to 10 occupy 50 × 10 = 500 bytes.
The first record is record 1, so record 10 is the last 50 bytes of these.
Bytes preceding record 10 = 500 – 50 = 450
Counting bytes from 1, the logical location of the first byte of record 10 = 450 + 1 = 451.

Example 2: A sequential access file has fixed-size 32-byte records. Assuming that the first record is record 0, the first byte of record 20 will be at what location?

Unix File Management
I-nodes (Index Node)
UNIX files are administered by the OS by means of i-nodes. An i-node (index node) is a control structure that contains the key information needed by the OS for a particular file. The attributes of the file as well as its permissions and other control information are stored in the i-node. The exact i-node structure varies from one UNIX implementation to another. The FreeBSD i-node structure is shown in Figure 8.

[Figure 8: FreeBSD i-node structure, showing fields such as timestamps (4) and size, and pointers leading to data blocks.]

File Allocation
1. It is done on a block basis.
2. Allocation is dynamic.
3. The blocks of a file on disk are not necessarily contiguous.
4. An indexed method is used to keep track of each file; the i-node includes a number of direct pointers and three indirect pointers.
5. The FreeBSD i-node includes 120 bytes of address information, organized as fifteen 64-bit addresses.
6. The first 12 addresses point to the first 12 data blocks of the file.
7. If the file requires more than 12 data blocks, one or more levels of indirection are used as follows:
• The thirteenth address in the i-node points to a block on disk that contains the next portion of the index. This is referred to as the single indirect block.
• If the file contains more blocks, the fourteenth address in the i-node points to a double indirect block. This block contains the addresses of additional single indirect blocks, each of which contains pointers to file blocks.
• If the file contains still more blocks, the fifteenth address in the i-node points to a triple indirect block that is a third level of indexing. This block points to additional double indirect blocks.

The capacity of a FreeBSD file with 4 kB block size is shown below:
Level             Number of Blocks       Number of Bytes
Direct            12                     48 K
Single indirect   512                    2 M
Double indirect   512 × 512 = 256 K      1 G
Triple indirect   512 × 256 K = 128 M    512 G
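The figures in the capacity table can be reproduced with a short calculation. The sketch below assumes 4 kB blocks and 8-byte (64-bit) block addresses, as stated in the text; the variable names are chosen for this example.

```python
BLOCK_SIZE = 4 * 1024   # 4 kB blocks, as in the capacity table
ADDRESS_SIZE = 8        # 64-bit block addresses
DIRECT_POINTERS = 12    # the first 12 addresses in the i-node

# Number of block addresses that fit in one index block.
pointers_per_block = BLOCK_SIZE // ADDRESS_SIZE  # 512

# Data blocks reachable at each level of the i-node.
levels = {
    "direct": DIRECT_POINTERS,
    "single indirect": pointers_per_block,
    "double indirect": pointers_per_block ** 2,
    "triple indirect": pointers_per_block ** 3,
}

# Bytes reachable at each level, and the overall maximum file size.
capacity = {name: blocks * BLOCK_SIZE for name, blocks in levels.items()}
max_file_bytes = sum(capacity.values())

for name, size in capacity.items():
    print(f"{name:16} {levels[name]:>10} blocks {size:>14} bytes")
print("maximum file size in GB:", max_file_bytes / 2**30)
```

The triple indirect level dominates the total: it alone contributes 512 GB, and the other three levels together add only slightly more than 1 GB.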
Chapter 5 • File Systems, I/O Systems, Protection and Security | 7.81
The total number of data blocks in a file depends on the capacity of the fixed-size blocks in the system. In FreeBSD, the minimum block size is 4 kB, and each block can hold a total of 512 block addresses. Thus, the maximum size of a file with this block size is over 500 GB.

Windows NT File System
The Windows NT file system (NTFS) provides a combination of reliability, compatibility and performance that is not available in the FAT file system.
1. It quickly performs standard file operations, such as write, read and search.
2. It also performs file-system recovery on very large hard disks.
3. Formatting a volume with NTFS creates several system files and the master file table (MFT), which contains information about all the files and folders on the NTFS volume (Figure 9).

Logical I/O
Manages general I/O functions on behalf of user processes, allowing them to deal with the device in terms of a device identifier and simple commands, such as open, close, read, write.

Device I/O
The requested operations and data are converted into appropriate sequences of I/O instructions, channel commands and controller orders.

Scheduling and Control
The actual queuing and scheduling of I/O operations occurs at this layer, as well as the control of the operations. Interrupts are handled, and I/O status is collected and reported.

[Figure: I/O organization for a communication port, with layers user process, communication architecture, device I/O, scheduling and control, and hardware.]

Figure 10 No buffering.
[Figure: head-movement plots for the track requests 87, 170, 40, 150, 36, 72, 66, 15.]

Example 5: For the following track requests: 87, 170, 40, 150, 36, 72, 66, 15. (Initially the head is at track 60 and the arm is moving outwards.)
Total head movement = (66 – 60) + (72 – 66) + (87 – 72) + (150 – 87) + (170 – 150) + (180 – 170) + (180 – 40) + (40 – 36) + (36 – 15)
= 6 + 6 + 15 + 63 + 20 + 10 + 140 + 4 + 21
= 285 cylinders
Average head movement = 285/8 = 35.6 cylinders

Advantages
1. Throughput is better than FIFO.
2. It is the basis for most scheduling algorithms.
3. Eliminates the discrimination.
4. No starvation.

Disadvantages
1. Because of the continuous scanning of the disk from end to end, the outer tracks are visited less often than the mid-range tracks.
2. The disk arm keeps scanning between two extremes; this may result in wear and tear of the disk assembly.
3. Requests arriving ahead of the arm position get immediate service, but requests that arrive behind the arm position have to wait for the arm to return.

C-SCAN algorithm (one-way elevator algorithm)
It treats the cylinders as a circular list. The head sweeps from the innermost cylinder to the outermost cylinder, satisfying the waiting requests in order of their locations. When it reaches the outermost cylinder, it sweeps back to the innermost cylinder without satisfying any requests and then starts again.

Example 6: Consider the cylinder requests: 87, 170, 40, 150, 36, 72, 66, 15. Starting cylinder = 60 (arm moving outwards).
Total head movement = (66 – 60) + (72 – 66) + (87 – 72) + (150 – 87) + (170 – 150) + (180 – 170) + (180 – 0) + (15 – 0) + (36 – 15) + (40 – 36)
= 6 + 6 + 15 + 63 + 20 + 10 + 180 + 15 + 21 + 4
= 340 cylinders
Average head movement = 340/8 = 42.5 cylinders

[Figure: head-movement plots for Examples 5 and 6.]
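The head-movement totals in Examples 5 and 6 can be checked with a small simulation. The sketch assumes, as the arithmetic in both examples implies, that the last cylinder is 180 and the first is 0, and that the return sweep of C-SCAN counts towards the total. The function names are chosen for this example.

```python
def scan_head_movement(requests, head, last_cylinder):
    """Total cylinders travelled by SCAN when the arm first moves outwards."""
    ahead = sorted(r for r in requests if r >= head)
    behind = sorted((r for r in requests if r < head), reverse=True)
    total, pos = 0, head
    for r in ahead:                      # serve requests on the outward sweep
        total += r - pos
        pos = r
    if behind:
        total += last_cylinder - pos     # run on to the last cylinder
        pos = last_cylinder
        for r in behind:                 # serve the rest on the way back
            total += pos - r
            pos = r
    return total


def cscan_head_movement(requests, head, last_cylinder, first_cylinder=0):
    """Total cylinders travelled by C-SCAN, counting the return sweep."""
    ahead = sorted(r for r in requests if r >= head)
    wrapped = sorted(r for r in requests if r < head)
    total, pos = 0, head
    for r in ahead:                                # outward sweep
        total += r - pos
        pos = r
    if wrapped:
        total += last_cylinder - pos               # finish the outward sweep
        total += last_cylinder - first_cylinder    # return without serving
        pos = first_cylinder
        for r in wrapped:                          # serve in increasing order
            total += r - pos
            pos = r
    return total


requests = [87, 170, 40, 150, 36, 72, 66, 15]
print(scan_head_movement(requests, head=60, last_cylinder=180))   # 285
print(cscan_head_movement(requests, head=60, last_cylinder=180))  # 340
```

C-SCAN travels further in total because of the idle return sweep, but every request waits at most one full sweep, which gives more uniform waiting times than the two-way scan.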
RAID Level 0:
Figure 18 Non-redundant (RAID0)
[Figure 18: strips 0 to 11 distributed across four disks.]
1. It does not include redundancy.
2. N disks are required.
3. Data availability in RAID level 0 is lower than that of a single disk.
4. It has a high data transfer capacity.
5. It has a high I/O request rate.

RAID Level 1:

RAID Level 2:
6. Here redundancy is achieved through Hamming code.
7. N + m disks are required.

RAID Level 3:
[Figure: data bits b0, b1, b2, b3 with parity p(b).]

[Figure: frequency-based replacement, (A) FIFO. Labels: MRU, LRU; re-reference: count unchanged; re-reference: count = count + 1; miss (new block brought in): count = 1.]

[Figure: block-level distributed parity. Blocks 0 to 19 with parity blocks P(0-3), P(4-7), P(8-11), P(12-15), P(16-19) placed on a different disk in each row.]
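The parity labels in the RAID figures, p(b) and P(0-3), refer to a parity value computed over the corresponding data. A minimal sketch, assuming the usual bytewise XOR parity and using made-up two-byte blocks, shows how one lost block can be rebuilt from the survivors. The function names are hypothetical.

```python
from functools import reduce


def parity(blocks):
    """XOR equal-length byte blocks together to form the parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_blocks, parity_block):
    """Recover the single missing data block from the survivors and the parity."""
    return parity(surviving_blocks + [parity_block])


# Hypothetical contents of four data blocks in one stripe.
data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa", b"\x00\xff"]
p = parity(data)
print(p.hex())  # 6996

# Suppose the disk holding data[2] fails: XOR of the rest restores it.
print(rebuild([data[0], data[1], data[3]], p) == data[2])  # True
```

Rebuilding works because XOR is its own inverse: combining the parity with every surviving block cancels them out and leaves only the missing one.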
3. Inference: A threat action where an unauthorized entity indirectly accesses sensitive data by reasoning from characteristics or by-products of communications.
4. Intrusion: An unauthorized entity gains access to sensitive data by circumventing a system’s security protections.

Deception
Threat to either system integrity or data integrity. Types of attacks that can result are as follows:
1. Masquerade: An unauthorized entity gains access to a system or performs a malicious act by posing as an authorized entity.
2. Falsification: False data deceive an authorized entity.
3. Repudiation: An entity deceives another by falsely denying responsibility for an act.

Disruption
A circumstance or event that interrupts or prevents the correct operation of system services and functions. Attacks for this threat are as follows:
1. Incapacitation: Prevents or interrupts system operation by disabling a system component.
2. Corruption: Undesirably alters system operation by adversely modifying system functions or data.
3. Obstruction: A threat action that interrupts delivery of system service by hindering system operation.

Usurpation
A circumstance or event that results in control of system services or functions by an unauthorized entity. Attacks with this threat are as follows:
1. Misappropriation: An entity assumes unauthorized logical or physical control of a system resource.
2. Misuse: Causes a system component to perform a function or service that is detrimental to system security.

Threats and assets
The assets of a computer are as follows:
1. Hardware
2. Software
3. Data
4. Communication lines

Hardware
A major threat to computer system hardware is the threat to availability (e.g., theft of CD-ROMs).

Software
1. A key threat to software is an attack on availability (e.g., deletion of software).
2. A threat to integrity.
3. A threat to confidentiality.

Data
Threats to data are attacks on
1. Availability
2. Confidentiality
3. Integrity

Communication lines and networks
Two types of attacks:
1. Passive attacks
2. Active attacks

Passive attacks
1. These are in the nature of monitoring of the transmissions.
2. Attackers obtain information that is being transmitted.
3. Two types of passive attacks:
• Release of message contents
• Traffic analysis
4. These are very difficult to detect because they do not involve any alteration of the data.

Active attacks
1. These attacks involve some modification of the data stream or the creation of a false stream and can be subdivided into four categories:
• Replay
• Masquerade
• Modification of messages
• Denial of service
2. It is difficult to prevent active attacks absolutely.

Intruders
Three types of intruders:

Masquerader
An individual who is not authorized to use the computer and who penetrates a system’s access controls to exploit a legitimate user’s account. He is likely to be an outsider.

Misfeasor
A legitimate user who accesses data, programs or resources for which such access is not authorized, or who is authorized for such access but misuses his or her privileges (generally an insider).

Clandestine user
An individual who seizes supervisory control of the system and uses this control to evade auditing and access controls or to suppress audit collection (either outsider or insider).

Hackers
Those who hack into computers do so for the thrill of it or for status. Attackers often look for targets of opportunity and then share the information with others.

Criminals
Organized groups of hackers have become a widespread and common threat to internet-based systems.

Malicious software overview
The most sophisticated types of threats to computer systems are presented by programs that exploit vulnerabilities in computing systems. These threats are referred to as malicious software or malware.
1. It is designed to cause damage to or use up the resources of a target computer.
2. There are two types of malicious software:
• Those that need a host program. Example: Viruses, logic bombs.
• Those that are independent. Example: Worms, bot programs.
3. We can also differentiate between two types of software threats:
• Those that do not replicate. These programs are activated by a trigger. Example: Logic bombs, backdoors.

Nature of Viruses
A virus can do anything that other programs do. It attaches itself to another program and executes secretly when the host program is run. Three parts of computer virus are as follows:

Bots
A bot is a program that secretly takes over another internet-attached computer and then uses that computer to launch attacks that are difficult to trace to the bot’s creator.
Exercises
Practice Problems 1
Directions for questions 1 to 15: Select the correct alternative from the given choices.
1. Given a system using unspanned blocking and 100 byte blocks. A file contains records 30, 40, 55, 80, 30, 40. What percentage of space will be wasted in the blocks allocated for the file?
(A) 31.25% (B) 41.25%
(C) 51.25% (D) 62.15%
2. Disk requests come into the disk driver for cylinders 15, 25, 10, 2, 35, 9, 42 in that order. The disk head is currently positioned over cylinder 15. A seek takes 6 msec per cylinder moved. What is the total seek time using First Come First Served Algorithm?
(A) 750 msec (B) 650 msec
(C) 550 msec (D) 450 msec
3. A Java application needs to load 50 libraries. To load each library, one disk access is required. Seek time to access the location is 10 ms. Rotational speed is 6000 rpm. The total time needed to load all libraries is
(A) 0.65 sec (B) 0.75 sec
(C) 0.85 sec (D) 1 sec
4. A program has just read the 13th record in a sequential access file. If it wants to read the 10th record next, how many records must the program read to input the tenth record?
(A) 5 (B) 0
(C) 10 (D) 13
5. A disk is formatted into 40 sectors and 20 tracks. The disk rotates at 200 ms in one revolution. The time taken by the head to move from the centre to the rim is 10 ms. There are three different files stored on the disk:
File P : Sector 2, track 4
File Q : Sector 5, track 1
File R : Sector 6, track 2
Calculate the average latency time required for the three files.
(A) 22.55 ms (B) 32.22 ms
(C) 21.66 ms (D) 30.22 ms
6. Match the following
(a) RAID0 (1) Parallel access
(b) RAID1 (2) Striping
(c) RAID2 (3) Use hamming code
(d) RAID3 (4) Mirrored
(A) a – 2, b – 4, c – 3, d – 1
(B) a – 1, b – 2, c – 3, d – 4
(C) a – 3, b – 2, c – 4, d – 1
(D) a – 4, b – 1, c – 2, d – 3
7. The correct matching for the following pairs is
(A) Disk scheduling (1) Round Robin
(B) Batch processing (2) SCAN
(C) Time sharing (3) LIFO
(D) Interrupt processing (4) FIFO
Practice Problems 2
2. If a process of 200 kB is transferred from backing store to memory and average disk latency is 10 ms, then what would be the total swap time, if transfer rate is 2 Mbps?
(A) 10 ms
(B) 20 ms
(C) 30 ms
(D) 40 ms
4. Disk scheduling involves deciding
(A) which disk should be accessed next
(B) the order in which disk access requests must be serviced.
(C) the physical location where files should be accessed on the disk
(D) disk access time and an unused space.
5. The root directory of a disk should be placed
(A) at a fixed address in the memory
(B) anywhere on the disk
(C) at a fixed location on the system disk
(D) at a location on floppy.
6. Direct access methods are not effectively supported by
(A) Contiguous allocation
(B) Linked allocation
(C) Indexed allocation
(D) Sequential allocation
7. In which of the following directory systems is it possible to have multiple paths for a file, starting from the root directory?
(A) Single-level directory
(B) Two-level directory
(C) Tree-structured directory
(D) Acyclic graph directory
8. The most common system’s security method is:
(A) Passwords
(B) Key card systems
(C) Surveillance system
(D) Lock system
9. Trojan Horse programs
(A) are legitimate programs that allow unauthorized access.
(B) are hacker programs that do not show up on the system
(C) really do not work
(D) are immediately discovered
10. Which of the following is a program that spreads throughout the network?
(A) Trojan Horse (B) Virus
(C) TSR (D) Worm
11. A program has just read the 15th record in a sequential access file. If it wants to read the 10th record next, how many records must the program read to input the tenth record?
(A) 0 (B) 5
(C) 4 (D) 10
12. Formatting of a floppy disk refers to
(A) Arranging the data on the disk in contiguous fashion
(B) Writing the directory
(C) Erasing the system area
(D) Writing identification information on all tracks
13. Sector interleaving in disks is done by
(A) the disk manufacturer
(B) the disk controller card
(C) the operating system
(D) the user
14. Disk I/O is done in terms of
(A) Tracks
(B) Blocks
(C) Bits
(D) Bytes
15. How many six-letter passwords can be constructed using lowercase letters and digits?
(A) 26^6 (B) 10^6
(C) 36^6 (D) 35^6
(A) 8 and 0 (B) 128 and 6
(C) 256 and 4 (D) 512 and 5
5. Consider a disk system with 100 cylinders. The requests to access the cylinders occur in following sequence:
4, 34, 10, 7, 19, 73, 2, 15, 6, 20
Assuming that the head is currently at cylinder 50, what is the time taken to satisfy all requests if it takes 1 ms to move from one cylinder to adjacent one and shortest seek time first policy is used? [2009]
(A) 95 ms (B) 119 ms
(C) 233 ms (D) 276 ms
Common data for questions 6 and 7: A hard disk has 63 sectors per track, 10 platters each with 2 recording surfaces and 1000 cylinders. The address of a sector is given as a triple 〈c, h, s〉, where c is the cylinder number, h is the surface number and s is the sector number. Thus, the 0th sector is addressed as 〈0, 0, 0〉, the 1st sector as 〈0, 0, 1〉, and so on.
6. The address 〈400, 16, 29〉 corresponds to the sector number: [2009]
(A) 505035 (B) 505036
(C) 505037 (D) 505038
7. The address of the 1039th sector is [2009]
(A) 〈0, 15, 31〉 (B) 〈0, 16, 30〉
(C) 〈0, 16, 31〉 (D) 〈0, 17, 31〉
8. A file system with 300 GByte disk uses a file descriptor with 8 direct block addresses, 1 indirect block address and 1 doubly indirect block address. The size of each disk block is 128 bytes and the size of each disk block address is 8 bytes. The maximum possible file size in this file system is [2012]
(A) 3 Kbytes
(B) 35 Kbytes
(C) 280 Kbytes
(D) dependent on the size of the disk
9. Consider a hard disk with 16 recording surfaces (0–15) having 16384 cylinders (0–16383) and each cylinder contains 64 sectors (0–63). Data storage capacity in each sector is 512 bytes. Data are organized cylinder-wise and the addressing format is <cylinder no., surface no., sector no.>. A file of size 42797 KB is stored in the disk and the starting disk location of the file is <1200, 9, 40>. What is the cylinder number of the last sector of the file, if it is stored in a contiguous manner? [2013]
(A) 1281 (B) 1282
(C) 1283 (D) 1284
10. A FAT (File allocation table)-based file system is being used, and the total overhead of each entry in the FAT is 4 bytes in size. Given a 100 × 10^6 bytes disk on which the file system is stored and data block size is 10^3 bytes, the maximum size of a file that can be stored on this disk in units of 10^6 bytes is ______. [2014]
11. Consider a disk pack with a seek time of 4 milliseconds and rotational speed of 10000 rotations per minute (RPM). It has 600 sectors per track and each sector can store 512 bytes of data. Consider a file stored in the disk. The file contains 2000 sectors. Assume that every sector access necessitates a seek, and the average rotational latency for accessing each sector is half of the time for one complete rotation. The total time (in milliseconds) needed to read the entire file is ________. [2015]
12. Suppose the following disk request sequence (track numbers) for a disk with 100 tracks is given: 45, 20, 90, 10, 50, 60, 80, 25, 70. Assume that the initial position of the R/W head is on track 50. The additional distance that will be traversed by the R/W head when the Shortest Seek Time First (SSTF) algorithm is used compared to the SCAN (Elevator) algorithm (assuming that SCAN algorithm moves towards 100 when it starts execution) is ______ tracks. [2015]
13. Consider a typical disk that rotates at 15000 rotations per minute (RPM) and has a transfer rate of 50 × 10^6 bytes/sec. If the average seek time of the disk is twice the average rotational delay and the controller’s transfer time is 10 times the disk transfer time, the average time (in milliseconds) to read or write a 512-byte sector of the disk is ________. [2015]
14. Consider a disk queue with requests for I/O to blocks on cylinders 47, 38, 121, 191, 87, 11, 92, 10. The C-LOOK scheduling algorithm is used. The head is initially at cylinder number 63, moving towards larger cylinder numbers on its servicing pass. The cylinders are numbered from 0 to 199. The total head movement (in number of cylinders) incurred while servicing these requests is _____. [2016]
15. In a file allocation system, which of the following allocation scheme(s) can be used if no external fragmentation is allowed? [2017]
I. Contiguous
II. Linked
III. Indexed
(A) I and III only (B) II only
(C) III only (D) II and III only
16. Consider a storage disk with 4 platters (numbered as 0, 1, 2 and 3), 200 cylinders (numbered as 0, 1, ..., 199), and 256 sectors per track (numbered as 0, 1, ..., 255). The following 6 disk requests of the form [sector number, cylinder number, platter number] are received by the disk controller at the same time:
[120, 72, 2], [180, 134, 1], [60, 20, 0], [212, 86, 3], [56, 116, 2], [118, 16, 1]
Currently the head is positioned at sector number 100 of cylinder 80, and is moving towards higher cylinder numbers. The average power dissipation in moving the head over 100 cylinders is 20 milliwatts and for reversing the direction of the head movement once is 15 milliwatts. Power dissipation associated with rotational latency and switching of head between different platters is negligible.
The total power consumption in milliwatts to satisfy all of the above disk requests using the Shortest Seek Time First disk scheduling algorithm is _________. [2018]
Answer Keys
Exercises
Practice Problems 1
1. A 2. A 3. B 4. C 5. C 6. A 7. C 8. A 9. C 10. D
11. C 12. B 13. C 14. B 15. A
Practice Problems 2
1. B 2. C 3. D 4. B 5. C 6. D 7. C 8. A 9. A 10. D
11. D 12. D 13. C 14. B 15. C