Unit 4 File System
1. Disk Formatting
2. Boot Blocks
3. Bad Blocks
1. FORMATTING
Disk Formatting
Low-level formatting (physical formatting):
● A new magnetic disk is a blank slate: it is just a platter of a magnetic recording material. Before a disk can store data,
it must be divided into sectors that the disk controller can read and write.
● Low-level formatting fills the disk with a special data structure for each sector.
● The data structure for a sector typically consists of a header, a data area (usually 512 bytes in size), and a trailer. The
header and trailer contain information used by the disk controller, such as a sector number and an error-correcting
code (ECC).
● When the controller writes a sector of data during normal I/O, the ECC is updated with a value calculated from all
the bytes in the data area. When the sector is read, the ECC is recalculated and compared with the stored value.
● If the stored and calculated numbers are different, this mismatch indicates that the data area of the sector has
become corrupted and that the disk sector may be bad.
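The write-then-verify cycle described above can be sketched as follows. This is only an illustration, not real controller logic: it uses CRC-32 from Python's zlib as a stand-in check value (a CRC only detects errors, whereas a true ECC can also correct some of them), and the Sector class and both function names are hypothetical.

```python
import zlib
from dataclasses import dataclass

SECTOR_DATA_SIZE = 512  # typical data-area size mentioned above


@dataclass
class Sector:
    number: int     # header: sector number
    data: bytes     # data area
    check: int = 0  # trailer: stored check value (stand-in for the ECC)


def write_sector(sector: Sector, data: bytes) -> None:
    """Store data and update the check value computed over the data area."""
    if len(data) != SECTOR_DATA_SIZE:
        raise ValueError("data area must be exactly 512 bytes")
    sector.data = data
    sector.check = zlib.crc32(data)


def read_sector(sector: Sector) -> bytes:
    """Recompute the check value and compare it with the stored one."""
    if zlib.crc32(sector.data) != sector.check:
        raise IOError(f"sector {sector.number}: check mismatch, sector may be bad")
    return sector.data
```

If the data area changes after the write without the trailer being updated, read_sector raises an error instead of returning corrupted data.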
Formatting by OS:
● The first step is to partition the disk into one or more groups of cylinders. The operating system can treat each
partition as though it were a separate disk. For instance, one partition can hold a copy of the operating system’s
executable code, while another holds user files.
● The second step is logical formatting, or creation of a file system. In this step, the operating system stores the initial
file-system data structures onto the disk. These data structures may include maps of free and allocated space and an
initial empty directory. To increase efficiency, most file systems group blocks together into larger chunks, frequently
called clusters.
● Some operating systems give special programs the ability to use a disk partition as a large sequential array of logical
blocks, without any file-system data structures. This array is sometimes called the raw disk, and I/O to this array is
termed raw I/O.
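As a rough sketch of what logical formatting produces, the function below builds an empty free-space map and an empty root directory for one partition, with blocks grouped into clusters. The dictionary layout and the names logical_format and cluster_of are assumptions made for illustration; real file systems store these structures in on-disk formats of their own.

```python
def logical_format(num_blocks: int, blocks_per_cluster: int) -> dict:
    """Build the initial, empty file-system structures for one partition."""
    num_clusters = num_blocks // blocks_per_cluster
    return {
        "blocks_per_cluster": blocks_per_cluster,
        "free_map": [True] * num_clusters,  # True = cluster is free
        "root_directory": {},               # initial empty directory
    }


def cluster_of(block: int, blocks_per_cluster: int) -> int:
    """Return the cluster that contains a given logical block."""
    return block // blocks_per_cluster
```

For example, a partition of 1000 blocks with 8 blocks per cluster yields 125 clusters, and logical block 17 falls in cluster 2.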
2. BOOT BLOCK
● For a computer to start running—for instance, when it is powered up or rebooted—it must have an initial program to run. This
initial bootstrap program tends to be simple. It initializes all aspects of the system, from CPU registers to device controllers and
the contents of main memory, and then starts the operating system.
● To do its job, the bootstrap program finds the operating-system kernel on disk, loads that kernel into memory, and jumps to an
initial address to begin the operating-system execution.
● The bootstrap is stored in read-only memory (ROM). This location is convenient, because ROM needs no initialization and is at a
fixed location that the processor can start executing when powered up or reset.
● The problem is that changing this bootstrap code requires changing the ROM hardware
chips. For this reason, most systems store a tiny bootstrap loader program in the boot ROM
whose only job is to bring in a full bootstrap program from disk.
● The full bootstrap program can be changed easily: a new version is simply written onto the
disk. The full bootstrap program is stored in the “boot blocks” at a fixed location on the disk.
A disk that has a boot partition is called a boot disk or system disk.
● The code in the boot ROM instructs the disk controller to read the boot blocks into memory
(no device drivers are loaded at this point) and then starts executing that code. The full
bootstrap program is more sophisticated than the bootstrap loader in the boot ROM. It is able
to load the entire operating system from a non-fixed location on disk and to start the
operating system running.
● The Windows system places its boot code in the first sector on the hard disk, which it terms
the master boot record, or MBR. Booting begins by running code that is resident in the
system’s ROM memory. This code directs the system to read the boot code from the MBR.
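The two-stage boot sequence above can be modelled with a toy simulation. Everything here is hypothetical: the disk is a plain dictionary, the boot block is assumed to sit at block 0, and the "kernel_block" field is an invented way for the boot block to record where the kernel currently lives.

```python
BOOT_BLOCK = 0  # fixed location known to the ROM loader (assumed to be block 0)


def rom_bootstrap_loader(disk: dict) -> dict:
    """Tiny ROM stage: only knows how to fetch the boot block from a fixed location."""
    return disk[BOOT_BLOCK]


def full_bootstrap(boot_block: dict, disk: dict) -> str:
    """Disk-resident stage: looks up where the kernel lives and loads it."""
    kernel_location = boot_block["kernel_block"]  # not fixed; recorded in the boot block
    return disk[kernel_location]["image"]


def boot(disk: dict) -> str:
    """Run both stages in order and report which kernel image was started."""
    boot_block = rom_bootstrap_loader(disk)
    kernel = full_bootstrap(boot_block, disk)
    return f"running {kernel}"
```

Moving the kernel only requires rewriting the boot block on disk; the ROM stage stays unchanged, which is the point of splitting the bootstrap in two.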
3. BAD BLOCKS
Because disks have moving parts and small tolerances (recall that the disk head flies just above the disk
surface), they are prone to failure. Sometimes the failure is complete; in this case, the disk needs to be
replaced and its contents restored from backup media to the new disk. More frequently, one or more
sectors become defective. Most disks even come from the factory with bad blocks. Depending on the disk
and controller in use, these blocks are handled in a variety of ways.
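1. One strategy is to scan the disk to find bad blocks while the disk is being formatted. Any bad blocks
that are discovered are flagged as unusable so that the file system does not allocate them.
2. More sophisticated disks let the controller maintain a list of bad blocks. Low-level formatting sets aside
spare sectors that are not visible to the operating system, and the controller can be told to replace each
bad sector logically with one of the spares. This scheme is known as sector sparing or forwarding.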
3. As an alternative to sector sparing, some controllers can be instructed to replace a bad block by sector
slipping. Here is an example: Suppose that logical block 17 becomes defective and the first available spare
follows sector 202. Sector slipping then remaps all the sectors from 17 to 202, moving them all down one
spot. That is, sector 202 is copied into the spare, then sector 201 into 202, then 200 into 201, and so on,
until sector 18 is copied into sector 19. Slipping the sectors in this way frees up the space of sector 18 so
that sector 17 can be mapped to it.
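The slipping example can be traced in a few lines of code. This is a sketch under simplifying assumptions: the disk is a dictionary from physical sector number to data, the spare is the slot immediately after the last sector in use (203 in the example), and the function name slip_sectors is made up for illustration.

```python
def slip_sectors(contents: dict, bad: int, spare: int) -> dict:
    """Shift data one slot toward the spare; return a logical-to-physical remap.

    contents maps physical sector -> data; `spare` is the first free
    sector after the last sector in use.
    """
    # Copy from the top down so nothing is overwritten before it is moved:
    # (spare-1) -> spare, (spare-2) -> spare-1, ..., (bad+1) -> bad+2.
    for sector in range(spare - 1, bad, -1):
        contents[sector + 1] = contents[sector]
    contents[bad + 1] = None  # freed slot that will now back the bad block
    # Logical blocks bad..spare-1 now live one physical sector further on.
    return {logical: logical + 1 for logical in range(bad, spare)}
```

Calling slip_sectors(contents, 17, 203) copies sector 202 into the spare first and sector 18 into 19 last, then maps logical block 17 to the freed physical sector 18.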
File System Structure
● File systems provide efficient and convenient access to the disk by allowing data to be stored,
located, and retrieved easily.
● A file system poses two quite different design problems.
○ The first problem is defining how the file system should look to the user. This task involves defining a file and
its attributes, the operations allowed on a file, and the directory structure for organizing files.
○ The second problem is creating algorithms and data structures to map the logical file system onto the
physical secondary-storage devices.
● The I/O control level consists of device drivers and interrupt handlers to transfer information
between the main memory and the disk system. A device driver can be thought of as a translator.
Its input consists of high-level commands such as “retrieve block 123.” Its output consists of
low-level, hardware-specific instructions that are used by the hardware controller, which
interfaces the I/O device to the rest of the system. The device driver usually writes specific bit
patterns to special locations in the I/O controller’s memory to tell the controller which device
location to act on and what actions to take.
● The basic file system needs only to issue generic commands to the appropriate device driver to
read and write physical blocks on the disk. Each physical block is identified by its numeric disk
address (for example, drive 1, cylinder 73, track 2, sector 10). This layer also manages the memory
buffers and caches that hold various file-system, directory, and data blocks. A block in the buffer is
allocated before the transfer of a disk block can occur. When the buffer is full, the buffer manager
must find more buffer memory or free up buffer space to allow the requested I/O to complete.
● The file-organization module knows about files and their logical blocks, as well as physical blocks.
By knowing the type of file allocation used and the location of the file, the file-organization module
can translate logical block addresses to physical block addresses for the basic file system to
transfer. Each file’s logical blocks are numbered from 0 (or 1) through N. Since the physical blocks
containing the data usually do not match the logical numbers, a translation is needed to locate each
block. The file-organization module also includes the free-space manager, which tracks unallocated
blocks and provides these blocks to the file-organization module when requested.
● Finally, the logical file system manages metadata information. Metadata includes all of the
file-system structure except the actual data (or contents of the files). The logical file system
manages the directory structure to provide the file-organization module with the information the
latter needs, given a symbolic file name. It maintains file structure via file-control blocks.
● When a layered structure is used for file-system implementation, duplication of code is minimized.
The I/O control and sometimes the basic file-system code can be used by multiple file systems.
Each file system can then have its own logical file-system and file-organization modules.
Unfortunately, layering can introduce more operating-system overhead, which may result in
decreased performance.
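The layering described above can be sketched as four small classes, each calling only the layer beneath it. This is an illustrative model, not how any particular operating system is written: the class names mirror the layer names, the "disk" is a Python list, and a file-control block is reduced to a dictionary holding a list of physical block numbers.

```python
class DeviceDriver:
    """I/O control: turns 'read block n' into device-specific work (here, a list lookup)."""
    def __init__(self, blocks):
        self.blocks = blocks

    def read(self, physical_block: int) -> bytes:
        return self.blocks[physical_block]


class BasicFileSystem:
    """Issues generic block commands to the driver and keeps a simple cache."""
    def __init__(self, driver):
        self.driver = driver
        self.cache = {}

    def read_block(self, physical_block: int) -> bytes:
        if physical_block not in self.cache:
            self.cache[physical_block] = self.driver.read(physical_block)
        return self.cache[physical_block]


class FileOrganizationModule:
    """Translates a file's logical block number into a physical block number."""
    def __init__(self, basic_fs):
        self.basic_fs = basic_fs

    def read_logical(self, fcb: dict, logical_block: int) -> bytes:
        return self.basic_fs.read_block(fcb["blocks"][logical_block])


class LogicalFileSystem:
    """Manages metadata: maps a symbolic file name to its file-control block."""
    def __init__(self, file_org, directory):
        self.file_org = file_org
        self.directory = directory  # name -> file-control block

    def read(self, name: str, logical_block: int) -> bytes:
        return self.file_org.read_logical(self.directory[name], logical_block)
```

Because the two lower classes know nothing about names or files, a second file system could reuse them and supply only its own upper two layers.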
Allocation Methods
Contiguous Allocation
● Contiguous allocation requires that each file occupy a set of contiguous blocks on the disk. Disk addresses define a linear
ordering on the disk. With this ordering, assuming that only one job is accessing the disk, accessing block b + 1 after block b
normally requires no head movement. When head movement is needed (from the last sector of one cylinder to the first
sector of the next cylinder), the head need only move from one track to the next. Thus, the number of disk seeks required
for accessing contiguously allocated files is minimal, as is seek time when a seek is finally needed.
● The directory entry for each file indicates the address of the starting block and the length of the area allocated for this file.
● Accessing a file that has been allocated contiguously is easy. For sequential access, the file system remembers the disk
address of the last block referenced and, when necessary, reads the next block. For direct access to block i of a file that
starts at block b, we can immediately access block b + i.
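Direct access under contiguous allocation reduces to one addition. A minimal sketch, assuming the directory entry supplies the starting block and the length in blocks (the function name physical_block is hypothetical):

```python
def physical_block(start: int, length: int, i: int) -> int:
    """Direct access: block i of a file starting at block `start` is start + i."""
    if not 0 <= i < length:
        raise IndexError("logical block outside the file")
    return start + i
```

For a file that starts at block 14 and is 3 blocks long, logical blocks 0, 1, and 2 map to disk blocks 14, 15, and 16; any other index is rejected.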
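Problems:
● Finding a contiguous run of free blocks for a new file is hard, and external fragmentation builds up as files
are created and deleted.
● The size of a file must be estimated when it is created, and a file that outgrows its allocated space is
difficult to extend in place.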
Linked scheme. An index block is normally one disk block. Thus, it can be read and written directly by
itself. To allow for large files, we can link together several index blocks.
Multilevel index. A variant of linked representation uses a first-level index block to point to a set of
second-level index blocks, which in turn point to the file blocks. To access a block, the operating system
uses the first-level index to find a second-level index block and then uses that block to find the desired
data block.
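The two-level lookup can be illustrated with a little arithmetic. The block size (4 KB) and pointer size (4 bytes) below are assumptions chosen for the example, giving 1024 pointers per index block; the function name is hypothetical.

```python
BLOCK_SIZE = 4096      # assumed block size in bytes
POINTER_SIZE = 4       # assumed size of one block pointer in bytes
POINTERS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE  # 1024


def two_level_lookup(logical_block: int) -> tuple:
    """Return (slot in first-level index, slot in second-level index)."""
    if not 0 <= logical_block < POINTERS_PER_BLOCK ** 2:
        raise IndexError("block beyond what two index levels can address")
    return divmod(logical_block, POINTERS_PER_BLOCK)


# Largest file two index levels can describe under these assumptions: 4 GiB.
MAX_FILE_BYTES = POINTERS_PER_BLOCK ** 2 * BLOCK_SIZE
```

Logical block 1025, for instance, is reached through slot 1 of the first-level index and slot 1 of the second-level index block it points to.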
Combined scheme. Another alternative, used in UNIX-based file systems, is to keep the first, say, 15
pointers of the index block in the file’s inode. The first 12 of these pointers point to direct blocks; that is,
they contain addresses of blocks that contain data of the file. Thus, the data for small files (of no more
than 12 blocks) do not need a separate index block. If the block size is 4 KB, then up to 48 KB of data can
be accessed directly. The next three pointers point to indirect blocks.
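The sketch below shows which group of inode pointers reaches a given logical block. It assumes the classic arrangement in which the three indirect pointers are single, double, and triple indirect, plus 4 KB blocks and 4-byte pointers; the text above only states that three pointers are indirect, so treat the rest as assumptions.

```python
BLOCK_SIZE = 4096
POINTERS_PER_BLOCK = BLOCK_SIZE // 4  # assuming 4-byte block pointers
DIRECT = 12                           # direct pointers held in the inode


def pointer_level(logical_block: int) -> str:
    """Say which inode pointer group reaches a given logical block of a file."""
    if logical_block < 0:
        raise IndexError("logical block numbers start at 0")
    if logical_block < DIRECT:
        return "direct"
    remaining = logical_block - DIRECT
    levels = ("single indirect", "double indirect", "triple indirect")
    for level, name in enumerate(levels, start=1):
        span = POINTERS_PER_BLOCK ** level  # data blocks reachable at this level
        if remaining < span:
            return name
        remaining -= span
    raise IndexError("block beyond the reach of the inode's pointers")
```

Blocks 0 through 11 (the first 48 KB) are direct, the next 1024 blocks go through the single indirect block, and so on.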
Bit Map
● Each block is represented by 1 bit. If the block is free, the bit is 1; if the block is allocated, the bit is
0.
● The main advantage of this approach is its relative simplicity and its efficiency in finding the first
free block or n consecutive free blocks on the disk.
● Finding first free block:
● Sequentially check each word in the bit map to see whether that value is not 0, since a 0-valued
word contains only 0 bits and represents a set of allocated blocks. The first non-0 word is scanned
for the first 1 bit, which is the location of the first free block.
● Bit vectors are inefficient unless the entire vector is kept in main memory, which is not always
possible for large disks.
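The word-by-word scan amounts to computing (number of bits per word) × (number of 0-valued words) + (offset of the first 1 bit). The sketch below assumes 32-bit words and that block 0 corresponds to the most significant bit of the first word; both are choices made for this example.

```python
BITS_PER_WORD = 32  # assumed word size


def first_free_block(words: list) -> int:
    """Return the number of the first free block (bit = 1), or -1 if none."""
    for word_index, word in enumerate(words):
        if word == 0:
            continue  # every block covered by this word is allocated
        # Offset of the first 1 bit, counting from the most significant bit.
        offset = BITS_PER_WORD - word.bit_length()
        return word_index * BITS_PER_WORD + offset
    return -1
```

With two all-zero words followed by a word whose first 1 bit is at offset 8, the first free block is 32 × 2 + 8 = 72.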
Linked List
Another approach to free-space management is to link together all the free disk blocks, keeping a pointer
to the first free block in a special location on the disk and caching it in memory. This first block contains a
pointer to the next free disk block, and so on.
This scheme is not efficient; to traverse the list, we must read each block, which requires substantial I/O
time.
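A small simulation makes the cost difference visible. Here the disk is modelled as a dictionary in which each free block stores the number of the next free block (None marks the end of the list); the representation and the function names are assumptions for illustration.

```python
def allocate_block(disk: dict, head):
    """Take the first block off the free list; returns (block, new_head)."""
    if head is None:
        raise RuntimeError("no free blocks")
    # One read is enough: the pointer to the next free block is in the block itself.
    return head, disk[head]


def count_free(disk: dict, head) -> int:
    """Walk the whole list; costs one simulated disk read per free block."""
    reads = 0
    while head is not None:
        head = disk[head]
        reads += 1
    return reads
```

Allocating a single block touches only the head of the list, which is the common case; it is the full traversal that needs one read per free block.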