File System Structure and File System Implementation
Hard disks have two important properties that make them suitable for secondary storage of
files in file systems: (1) Blocks of data can be rewritten in place, and (2) they are direct
access, allowing any block of data to be accessed with only (relatively) minor
movements of the disk heads and rotational latency.
Disks are usually accessed in physical blocks, rather than a byte at a time. Block
sizes may range from 512 bytes to 4K or larger.
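Because the disk is addressed in whole blocks, a byte position within a file has to be
translated into a block number plus an offset inside that block. A minimal sketch of that
arithmetic follows. The 4096-byte block size and the function names locate and
blocks_touched are assumptions chosen for illustration, not part of any particular system.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes (the text gives 512 bytes to 4K or larger)


def locate(byte_offset, block_size=BLOCK_SIZE):
    """Return (block_number, offset_within_block) for a byte offset."""
    if byte_offset < 0:
        raise ValueError("byte offset must be non-negative")
    return divmod(byte_offset, block_size)


def blocks_touched(byte_offset, length, block_size=BLOCK_SIZE):
    """Return the range of block numbers covered by a transfer of `length` bytes."""
    if length <= 0:
        return range(0)
    first = byte_offset // block_size
    last = (byte_offset + length - 1) // block_size
    return range(first, last + 1)
```

For example, locate(5000) gives (1, 904), and a 200-byte read starting at byte 4000
touches blocks 0 and 1, so both blocks must be transferred even though only 200 bytes are
wanted.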
File systems organize storage on disk drives, and can be viewed as a layered design:
o At the lowest layer are the physical devices, consisting of the magnetic media,
motors & controls, and the electronics connected to them and controlling them.
Modern disks put more and more of the electronic controls directly on the disk
drive itself, leaving relatively little work for the disk controller card to perform.
o I/O Control consists of device drivers, special software programs ( often written
in assembly ) which communicate with the devices by reading and writing special
codes directly to and from memory addresses corresponding to the controller card's
registers. Each controller card ( device ) on a system has a different set of addresses
( registers, a.k.a. ports ) that it listens to, and a unique set of command codes and
result codes that it understands.
o The basic file system level works directly with the device drivers in terms of
retrieving and storing raw blocks of data, without any consideration for what is in each
block. Depending on the system, blocks may be referred to with a single block
number ( e.g. block # 234234 ), or with cylinder-head-sector combinations.
o The file organization module knows about files and their logical blocks, and
how they map to physical blocks on the disk. In addition to translating from logical to
physical blocks, the file organization module also maintains the list of free blocks, and
allocates free blocks to files as needed.
o The logical file system deals with all of the metadata associated with a file
( UID, GID, mode, dates, etc. ), i.e. everything about the file except the data itself. This
level manages the directory structure and the mapping of file names to file control
blocks, FCBs, which contain all of the metadata as well as block number information
for finding the data on the disk.
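The two block-addressing schemes mentioned for the basic file system layer, a single block
number versus a block named by cylinder, head, and sector (CHS), can be converted into one
another. The sketch below uses the conventional formula, in which sectors are numbered from
1 while cylinders and heads are numbered from 0. The geometry used in the example (16 heads
per cylinder, 63 sectors per track) is an assumption for illustration only.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder, sectors_per_track):
    """Convert a cylinder/head/sector address to a single linear block number."""
    if cylinder < 0 or not 0 <= head < heads_per_cylinder:
        raise ValueError("cylinder or head out of range")
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector numbers run from 1 to sectors_per_track")
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba, heads_per_cylinder, sectors_per_track):
    """Convert a linear block number back to (cylinder, head, sector)."""
    if lba < 0:
        raise ValueError("block number must be non-negative")
    cylinder, remainder = divmod(lba, heads_per_cylinder * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1  # sectors are 1-based
```

With the assumed geometry, lba_to_chs(234234, 16, 63) gives (232, 6, 1), and
chs_to_lba(232, 6, 1, 16, 63) returns 234234 again.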
The layered approach to file systems means that much of the code can be used uniformly
for a wide variety of different file systems, and only certain layers need to be file
system specific.
Common file systems in use include the UNIX file system ( UFS ), the Berkeley Fast File
System ( FFS ), the Windows file systems FAT, FAT32, and NTFS, the CD-ROM file system
ISO 9660, and, for Linux, the extended file systems ext2 and ext3 ( among 40 others
supported. )
Overview
File systems store several important data structures on the disk:
o A boot-control block, (per volume) a.k.a. the boot block in UNIX or the
partition boot sector in Windows contains information about how to boot the
system off of this disk. This will generally be the first sector of the volume if
there is a bootable system loaded on that volume, or the block will be left
vacant otherwise.
o A volume control block, (per volume) a.k.a. the superblock in UNIX or the
master file table in NTFS, contains information such as the partition
table, number of blocks on each file system, and pointers to free blocks and
free FCB blocks.
o A directory structure (per file system), containing file names and pointers to
corresponding FCBs. UNIX uses inode numbers, and NTFS uses a master file
table.
o The File Control Block, FCB, (per file) containing details about ownership,
size, permissions, dates, etc. UNIX stores this information in inodes, and
NTFS in the master file table as a relational database structure.
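The relationship between these on-disk structures can be sketched with a few in-memory
records. This is a simplified model under stated assumptions, not the layout of any real
file system: the class names, the field names, and the create_file and lookup helpers are
all hypothetical, and the directory is flattened to a single level.

```python
from dataclasses import dataclass, field


@dataclass
class FileControlBlock:
    """Per-file metadata (an inode in UNIX terms)."""
    owner_uid: int
    mode: int
    size: int = 0
    block_numbers: list = field(default_factory=list)  # where the data lives


@dataclass
class VolumeControlBlock:
    """Per-volume bookkeeping: sizes plus the free block and free FCB lists."""
    block_size: int
    total_blocks: int
    free_blocks: list
    free_fcbs: list


@dataclass
class Volume:
    control: VolumeControlBlock
    fcbs: dict = field(default_factory=dict)       # FCB number -> FileControlBlock
    directory: dict = field(default_factory=dict)  # file name -> FCB number


def create_file(volume, name, uid, mode=0o644):
    """Allocate a free FCB, fill it in, and record the name in the directory."""
    if name in volume.directory:
        raise FileExistsError(name)
    if not volume.control.free_fcbs:
        raise OSError("no free FCBs left on this volume")
    fcb_number = volume.control.free_fcbs.pop(0)
    volume.fcbs[fcb_number] = FileControlBlock(owner_uid=uid, mode=mode)
    volume.directory[name] = fcb_number
    return fcb_number


def lookup(volume, name):
    """Follow the directory entry to the file's FCB."""
    return volume.fcbs[volume.directory[name]]
```

The directory holds only names and FCB numbers, so all metadata and block locations are
reached through the FCB. This mirrors the way a UNIX directory entry refers to an inode
number rather than holding the file's attributes itself.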
There are also several key data structures stored in memory:
o A system-wide open file table, containing a copy of the FCB for every
currently open file in the system, as well as some other related information.
o A per-process open file table, containing a pointer to the system open file table as
well as some other information. ( For example the current file position pointer
may be either here or in the system file table, depending on the
implementation and whether the file is being shared or not. )
Figure illustrates some of the interactions of file system components when files are
created and/or used:
o When a new file is created, a new FCB is allocated and filled out with
important information regarding the new file. The appropriate directory is
modified with the new file name and FCB information.
o When a file is accessed during a program, the open ( ) system call reads in the
FCB information from disk, and stores it in the system-wide open file table.
An entry is added to the per-process open file table referencing the system-
wide table, and an index into the per-process table is returned by the open( )
system call. UNIX refers to this index as a file descriptor, and Windows refers
to it as a file handle.
o If another process already has a file open when a new request comes in for the
same file, and it is sharable, then a counter in the system-wide table is
incremented and the per-process table is adjusted to point to the existing entry
in the system-wide table.
o When a file is closed, the per-process table entry is freed, and the counter in
the system-wide table is decremented. If that counter reaches zero, then the
system-wide table entry is also freed. Any data currently stored in memory cache for
this file is written out to disk if necessary.
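The open and close bookkeeping described above can be sketched as a small simulation. It
is a toy model under stated assumptions: the class names, the dictionary layout, and the
read_fcb callback (standing in for reading the FCB from disk) are hypothetical, the file
position is kept in the per-process table, and every file is treated as sharable.

```python
class SystemOpenFileTable:
    """System-wide table: one entry per open file, with a count of openers."""

    def __init__(self):
        self.entries = {}  # file name -> {"fcb": ..., "open_count": int}

    def acquire(self, name, read_fcb):
        entry = self.entries.get(name)
        if entry is None:
            # First opener: read the FCB from "disk" and cache a copy.
            entry = {"fcb": read_fcb(name), "open_count": 0}
            self.entries[name] = entry
        entry["open_count"] += 1
        return entry

    def release(self, name):
        entry = self.entries[name]
        entry["open_count"] -= 1
        if entry["open_count"] == 0:
            # Last closer: free the system-wide entry.
            del self.entries[name]


class Process:
    """Holds a per-process table whose indices act as file descriptors."""

    def __init__(self, system_table, read_fcb):
        self.system_table = system_table
        self.read_fcb = read_fcb
        self.open_files = {}  # descriptor -> {"name", "entry", "position"}

    def open(self, name):
        entry = self.system_table.acquire(name, self.read_fcb)
        fd = 0
        while fd in self.open_files:  # pick the lowest unused index
            fd += 1
        self.open_files[fd] = {"name": name, "entry": entry, "position": 0}
        return fd

    def close(self, fd):
        slot = self.open_files.pop(fd)
        self.system_table.release(slot["name"])
```

If two Process objects sharing one SystemOpenFileTable both open the same name, read_fcb
is called only once and the count for that entry is 2. The entry disappears only after
both have closed it.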
Physical disks are commonly divided into smaller units called partitions. They can
also be combined into larger units, but that is most commonly done for RAID
installations and is left for later chapters.
Partitions can either be used as raw devices (with no structure imposed upon them), or
they can be formatted to hold a file system ( i.e. populated with FCBs and initial
directory structures as appropriate. )
Raw partitions are generally used for swap space, and may also be used for certain
programs such as databases that choose to manage their own disk storage system.
Partitions containing file systems can generally only be accessed using the file system
structure by ordinary users, but can often be accessed as a raw device also by root.
The boot block is accessed as part of a raw partition, by the boot program prior to
any operating system being loaded. The root partition contains the OS kernel and at
least the key portions of the OS needed to complete the boot process. At boot time
the root partition is mounted, and control is transferred from the boot program to the
kernel found there. (Older systems required that the root partition lie completely
within the first 1024 cylinders of the disk, because that was as far as the boot
program could reach. Once the kernel had control, then it could access partitions
beyond the 1024 cylinder boundary. )
Virtual File Systems, VFS, provide a common interface to multiple different file system
types. In addition, a VFS provides a unique identifier ( vnode ) for each file across the
entire file space, including across all file systems of different types. ( UNIX inodes are
unique only within a single file system, and certainly do not carry across networked file
systems. )
Linux VFS provides a set of common functionalities for each file system, using function
pointers accessed through a table. The same functionality is accessed through the same
table position for all file system types, though the actual functions pointed to by the
pointers may be file system-specific. Common operations provided include open( ),
read( ), write( ), and mmap( ).
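The table-of-function-pointers idea can be illustrated with a small sketch. This is not the
Linux kernel's actual interface: the two toy file systems ( "ramfs" keeps data in a
dictionary, "nullfs" discards it ), the Mount class, and the vfs_read and vfs_write helpers
are all hypothetical names made up for the example. What it shows is that the caller always
indexes the same table position, while the function found there depends on the file system
type.

```python
OP_READ, OP_WRITE = 0, 1  # fixed table positions shared by every file system type


def ramfs_read(store, path):
    return store.get(path, b"")


def ramfs_write(store, path, data):
    store[path] = bytes(data)
    return len(data)


def nullfs_read(store, path):
    return b""  # nothing is ever stored


def nullfs_write(store, path, data):
    return len(data)  # pretend the write succeeded, keep nothing


# One operations table per file system type; same layout, different functions.
RAMFS_OPS = (ramfs_read, ramfs_write)
NULLFS_OPS = (nullfs_read, nullfs_write)


class Mount:
    """A mounted file system: its operations table plus private state."""

    def __init__(self, ops):
        self.ops = ops
        self.store = {}


def vfs_read(mount, path):
    # Generic layer: look up the read slot without knowing the file system type.
    return mount.ops[OP_READ](mount.store, path)


def vfs_write(mount, path, data):
    return mount.ops[OP_WRITE](mount.store, path, data)
```

Code that calls vfs_read and vfs_write never tests which file system it is talking to.
Supporting a new type in this sketch only requires supplying another table with functions
in the same positions.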