
Chapter 12 File System Implementation

Chapter 12 discusses the implementation of file systems, including local and remote systems, directory structures, and various allocation methods. It covers the organization of disk structures, in-memory file system structures, and free-space management techniques. Additionally, it addresses performance efficiency, recovery methods, and specific file systems like WAFL and ZFS.

FILE-SYSTEM IMPLEMENTATION
Chapter 12
CHAPTER OBJECTIVES

• To describe the details of implementing local file systems and directory structures.
• To describe the implementation of remote file systems.
• To discuss block allocation and free-block algorithms and trade-offs.
TABLE OF CONTENTS
12.1 File-System Structure
12.2 File-System Implementation
12.3 Directory Implementation
12.4 Allocation Methods
12.5 Free-Space Management
12.6 Efficiency and Performance
12.7 Recovery
12.8 NFS
12.9 Example: The WAFL File System
12.10 Summary
12.1 FILE-SYSTEM STRUCTURE
▪ Disks
-provide most of the secondary storage on which file systems are maintained.
-To improve I/O efficiency, I/O transfers between memory and disk are performed in units of BLOCKS.

▪ Each block has one or more sectors.
-Depending on the disk drive, sector size varies from 32 bytes to 4,096 bytes; the usual size is 512 bytes.

▪ 2 Characteristics of Disks
-Blocks can be read, modified, and written back in place.
-Blocks can be accessed directly in any order.
File Systems
⮚ Provide efficient and convenient access to
the disk by allowing data to be stored,
located, and retrieved easily.
⮚ File system design problems:
-defining how the file system should look to
the user.
-creating algorithms and data structures to
map the logical file system onto the
physical secondary-storage devices.
File System Layers:

I/O Control (Input/Output)


⮚ This level consists of device drivers and interrupt handlers
to transfer information between the main memory and the
disk system.

Basic File System


⮚ It needs only to issue generic commands to the
appropriate device driver to read and write physical blocks
on the disk.
⮚ Each physical block is identified by its numeric disk
address (for example, drive 1, cylinder 73, track 2, sector
10).
File-organization Module
⮚ knows about files and their logical blocks, as well as
physical blocks.
⮚ also includes the free-space manager, which tracks
unallocated blocks and provides these blocks to the file-
organization module when requested.

Logical File System


⮚ It manages metadata information.
⮚ Metadata includes all of the file-system structure except
the actual data (or contents of the files).
⮚ The logical file system manages the directory structure to
provide the file-organization module with the information
the latter needs, given a symbolic file name.
⮚ It maintains file structure via file-control blocks.
File-control Block (FCB)
▪ (an inode in UNIX file
systems) contains information
about the file, including
ownership, permissions, and
location of the file contents.

▪ UNIX uses the UNIX file system (UFS), which is based on the Berkeley Fast File System (FFS).

▪ Windows supports disk file-system formats of FAT, FAT32, and NTFS (or Windows NT File System), as well as CD-ROM and DVD file-system formats.
12.2 FILE-SYSTEM IMPLEMENTATION
Disk Structures
• A boot control block (per volume) can contain
information needed by the system to boot an
operating system from that volume. If the disk
does not contain an operating system, this block
can be empty. It is typically the first block of a
volume. In UFS, it is called the boot block. In
NTFS, it is the partition boot sector.

• A volume control block (per volume) contains volume (or partition) details, such as the number of blocks in the partition, the size of the blocks, a free-block count and free-block pointers, and a free-FCB count and FCB pointers. In UFS, this is called a superblock. In NTFS, it is stored in the master file table.
Disk Structures
• A directory structure (per file
system) is used to organize the files.
In UFS, this includes file names and
associated inode numbers. In NTFS,
it is stored in the master file table.

• A per-file FCB contains many details about the file. It has a unique identifier number to allow association with a directory entry. In NTFS, this information is actually stored within the master file table, which uses a relational database structure, with a row per file.
In-Memory File Systems Structures
▪ An in-memory mount table contains information about each mounted volume.

▪ An in-memory directory-structure cache holds the directory information of recently accessed directories. (For directories at which volumes are mounted, it can contain a pointer to the volume table.)
In-Memory File Systems Structures
▪ The system-wide open-file
table contains a copy of the FCB
of each open file, as well as other
information.

▪ The per-process open-file table contains a pointer to the appropriate entry in the system-wide open-file table, as well as other information.

▪ Buffers hold file-system blocks when they are being read from disk or written to disk.
Partitions and Mounting
[Figure: Exploded View of Hard Drive]
▪ A disk can be subdivided into partitions.
▪ Disks or partitions can be RAID protected against failure.
▪ A disk or partition can be used raw (without a file system) or formatted with a file system.
▪ Partitions are also known as minidisks or slices.

▪ An entity containing a file system is known as a volume.
▪ Each volume containing a file system also tracks that file system's info in a device directory or volume table of contents.

▪ The boot loader knows enough about the file-system structure to be able to find and load the kernel and start it executing.
Partitions and Mounting
Root Partition
▪ It contains the operating-system kernel and sometimes other system files, and is mounted at boot time.

▪ As part of a successful mount operation, the operating system verifies that the device contains a valid file system.

▪ It does so by asking the device driver to read the device directory and verifying that the directory has the expected format. If the format is invalid, the partition must have its consistency checked and possibly corrected, either with or without user intervention.

▪ Finally, the operating system notes in its in-memory mount table that a file system is mounted, along with the type of the file system. The details of this function depend on the operating system.
Virtual File Systems

▪ Virtual File Systems (VFS) provide an object-oriented way of implementing file systems.

▪ VFS allows the same system call interface (the API) to be used for different types of file systems.

▪ The API is to the VFS interface, rather than any specific type of file system.
Virtual File System (VFS) Layer
Functions:
1. It separates file-system-generic operations from their implementation by defining a clean VFS interface.

2. It provides a mechanism for uniquely representing a file throughout a network. The VFS is based on a file-representation structure, called a vnode, that contains a numerical designator for a network-wide unique file.

▪ (UNIX inodes are unique within only a single file system.) This network-wide uniqueness is required for support of network file systems. The kernel maintains one vnode structure for each active node (file or directory).
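
The separation of generic operations from their implementation can be illustrated with a minimal sketch. The class names (`FileSystem`, `LocalFS`, `RemoteFS`, `VFS`) and the prefix-based mount table are hypothetical, chosen only for illustration; a real kernel VFS dispatches through vnode operation tables, and a real remote file system would issue RPCs.

```python
from abc import ABC, abstractmethod


class FileSystem(ABC):
    """Hypothetical interface that every concrete file system implements."""

    @abstractmethod
    def read(self, path: str) -> bytes:
        ...


class LocalFS(FileSystem):
    def __init__(self, files):
        self.files = files  # path -> contents, standing in for a disk

    def read(self, path):
        return self.files[path]


class RemoteFS(FileSystem):
    """Stands in for a network file system (no real RPC is made here)."""

    def __init__(self, server_files):
        self.server_files = server_files

    def read(self, path):
        return self.server_files[path]


class VFS:
    """Callers use one API; the VFS picks the file system by mount point."""

    def __init__(self):
        self.mounts = {}

    def mount(self, prefix, fs):
        self.mounts[prefix] = fs

    def read(self, path):
        # Choose the longest mount prefix that matches the path.
        prefix = max((p for p in self.mounts if path.startswith(p)), key=len)
        return self.mounts[prefix].read(path[len(prefix):])


vfs = VFS()
vfs.mount("/", LocalFS({"etc/hosts": b"local"}))
vfs.mount("/net/", RemoteFS({"data.txt": b"remote"}))
```

The caller of `vfs.read()` never needs to know which kind of file system serves the path.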
12.3 DIRECTORY IMPLEMENTATION

Linear list of file names with pointers to the data blocks.

▪ simple to program
▪ time-consuming to execute

Hash Table – linear list with hash data structure.

▪ decreases directory search time
▪ collisions – situations where two file names hash to the same location
▪ fixed size
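
The two directory organizations can be compared with a small sketch. It assumes directory entries are `(name, block)` pairs and resolves collisions by chaining, which is one common choice and not necessarily the one the slides have in mind; `linear_lookup` and `HashedDirectory` are illustrative names.

```python
def linear_lookup(entries, name):
    """Linear list: scan every entry until the name matches."""
    for entry_name, block in entries:
        if entry_name == name:
            return block
    return None


class HashedDirectory:
    """Fixed-size hash table; colliding names share a bucket (chaining)."""

    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]

    def _bucket(self, name):
        return self.buckets[hash(name) % len(self.buckets)]

    def add(self, name, block):
        self._bucket(name).append((name, block))

    def lookup(self, name):
        # Only the entries in one bucket are scanned, not the whole directory.
        return linear_lookup(self._bucket(name), name)


entries = [("a.txt", 10), ("b.txt", 42)]
directory = HashedDirectory(size=2)  # tiny table, so collisions must occur
for file_name, block in [("a.txt", 10), ("b.txt", 42), ("c.txt", 7)]:
    directory.add(file_name, block)
```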
12.4 ALLOCATION METHODS
CONTIGUOUS ALLOCATION

Contiguous Allocation Basics:
-Files are stored in consecutive blocks on the disk, which minimizes disk head movement and speeds up access.

Accessing Files:
-Sequential access is efficient with contiguous allocation because the next block is adjacent to the current one.
-Direct access is efficient because block i of a file that starts at block b is simply block b + i.
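
As a minimal sketch of address translation under contiguous allocation, assume a directory entry records a starting block and a length in blocks (the function name and the example numbers are made up):

```python
def contiguous_block(start, length, i):
    """Return the disk block that holds logical block i of the file."""
    if not 0 <= i < length:
        raise IndexError("logical block outside the file")
    return start + i  # one addition, no disk reads needed
```

For a file starting at block 14 with length 3, logical block 2 is disk block 16.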
LINKED ALLOCATION
Linked Allocation Basics:
-Each file is stored as a linked list of disk blocks, which can be scattered across the disk.
-The directory contains pointers to the first and last blocks of a file, and each block has a pointer to the next block.

Advantages of Linked Allocation:
-No External Fragmentation
-File Size Flexibility
-No Disk Compaction Needed

Disadvantages of Linked Allocation:
-Sequential Access Only
-Space Overhead for Pointers
-Reliability Issues
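
The "sequential access only" drawback shows up in a short sketch. It assumes the per-block next pointers are modeled as a dictionary `next_ptr` (a hypothetical stand-in for reading each block from disk); reaching logical block i takes i pointer hops.

```python
def linked_block(first, next_ptr, i):
    """Follow i next-pointers from the first block of the file."""
    block = first
    for _ in range(i):
        block = next_ptr[block]  # each hop would be a disk read
        if block is None:
            raise IndexError("logical block outside the file")
    return block


# Example chain (made-up block numbers): 9 -> 16 -> 1 -> 10 -> 25 -> end
next_ptr = {9: 16, 16: 1, 1: 10, 10: 25, 25: None}
```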
INDEXED ALLOCATION

▪ Indexed allocation brings all the pointers together into one location called the INDEX BLOCK.
▪ Each file has its own index block, which is an array of disk-block addresses.
▪ The ith entry in the index block points to the ith block of the file.

▪ Solves Direct Access Issue:
-Unlike linked allocation, indexed allocation allows efficient direct access by bringing all pointers for a file into a single index block.

▪ How it Works:
-Each file has an index block containing pointers to all the disk blocks that store the file’s data. The directory entry points to the index block.
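
A minimal sketch of the lookup path, assuming the disk is modeled as a dictionary from block number to contents and the directory maps a file name to its index block (all names and block numbers are illustrative):

```python
def indexed_block(directory, disk, name, i):
    """Directory entry -> index block -> address of logical block i."""
    index_block = disk[directory[name]]  # one read to fetch the index block
    return index_block[i]                # then direct access to entry i


disk = {19: [9, 16, 1, 10, 25]}  # block 19 holds the file's index block
directory = {"jeep": 19}
```

Unlike the linked scheme, reaching logical block 3 does not require visiting blocks 0 through 2.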
12.5 FREE-SPACE MANAGEMENT
▪ To keep track of free disk space, the system maintains a free-space list.

▪ The free-space list records all free disk blocks, that is, those not allocated to some file or directory.

▪ To create a file, we search the free-space list for the required amount of space and allocate that space to the new file.

▪ This space is then removed from the free-space list.

▪ When a file is deleted, its disk space is added to the free-space list.
Bit Vector
▪ The free-space list is implemented as a bit map or bit vector.
▪ Each block is represented by 1 bit. If the block is free, the bit is 1; if the block is allocated, the bit is 0.

▪ For example, consider a disk where blocks 2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, and 27 are free and the rest of the blocks are allocated. The free-space bit map would be

001111001111110001100000011100000 ...

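▪ The calculation of the block number of the first free block is
(number of bits per word) × (number of 0-value words) + offset of first 1 bit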
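
A minimal sketch of finding the first free block by scanning the bit vector one word at a time. It assumes 8-bit words and that block 0 corresponds to the most significant bit of word 0; both are illustrative choices, and `first_free_block` is a hypothetical name.

```python
def first_free_block(words, bits_per_word=8):
    """Skip all-zero words, then locate the first 1 bit in the next word."""
    for word_index, word in enumerate(words):
        if word != 0:
            # Offset of the first 1 bit, counted from the most significant bit.
            offset = bits_per_word - word.bit_length()
            return word_index * bits_per_word + offset
    return None  # no free block


# First 16 bits of the example bit map above: 00111100 11111100
example_words = [0b00111100, 0b11111100]
```

Whole words equal to 0 are skipped with a single comparison, which is why the scan is fast when the bit vector fits in memory.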


Linked List
▪ Another approach to free-space management is to link together all the free disk blocks, keeping a pointer to the first free block in a special location on the disk and caching it in memory.

▪ This first block contains a pointer to the next free disk block, and so on.

▪ Usually, the operating system simply needs a free block so that it can allocate that block to a file, so the first block in the free list is used.
Grouping
▪ A modification of the free-list approach stores the addresses of n free blocks in the first free block.

▪ The first n−1 of these blocks are actually free.

▪ The last block contains the addresses of another n free blocks, and so on.

▪ The addresses of a large number of free blocks can now be found quickly, unlike the situation when the standard linked-list approach is used.
Counting
▪ Another approach takes advantage of the fact that, generally, several contiguous blocks may be allocated or freed simultaneously, particularly when space is allocated with the contiguous-allocation algorithm or through clustering.

▪ Thus, rather than keeping a list of n free disk addresses, we can keep the address of the first free block and the number (n) of free contiguous blocks that follow the first block.

▪ Each entry in the free-space list then consists of a disk address and a count.
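
A minimal sketch of the counting representation, assuming the free blocks are given as a collection of block numbers and each entry is stored as an `(address, count)` pair where `count` is the total length of the run (`to_extents` is an illustrative name):

```python
def to_extents(free_blocks):
    """Collapse free block numbers into (first_block, run_length) entries."""
    extents = []
    for block in sorted(free_blocks):
        if extents and extents[-1][0] + extents[-1][1] == block:
            extents[-1][1] += 1         # block extends the current run
        else:
            extents.append([block, 1])  # block starts a new run
    return [tuple(e) for e in extents]


# The free blocks from the bit-vector example
free = [2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, 27]
```

Fifteen free blocks collapse into four entries: (2, 4), (8, 6), (17, 2), and (25, 3).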
Space Maps

▪ ZFS creates metaslabs to divide the space on the device into chunks of manageable size. A given volume may contain hundreds of metaslabs.

▪ Each metaslab has an associated space map. ZFS uses the counting algorithm to store information about free blocks. It uses log-structured file-system techniques to record them.

▪ The space map is a log of all block activity (allocating and freeing), in time order, in counting format.
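
The idea of a time-ordered log in counting format can be sketched as follows. This is not the ZFS on-disk format; the record layout `(operation, start, count)` and the function `replay` are assumptions made only to show how replaying the log yields the current set of free blocks.

```python
def replay(log, total_blocks):
    """Replay alloc/free records in time order to rebuild the free set."""
    free = set(range(total_blocks))  # assume everything starts out free
    for op, start, count in log:
        run = set(range(start, start + count))
        if op == "alloc":
            free -= run
        elif op == "free":
            free |= run
        else:
            raise ValueError(f"unknown operation: {op}")
    return free


log = [("alloc", 0, 8), ("free", 2, 4), ("alloc", 4, 1)]
```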
12.6 EFFICIENCY AND PERFORMANCE

Efficiency depends on:
-disk allocation and directory algorithms
-types of data kept in a file's directory entry

Performance
-disk cache – separate section of main memory for frequently used blocks
-free-behind and read-ahead – techniques to optimize sequential access
-improve PC performance by dedicating a section of memory as a virtual disk, or RAM disk
BUFFER CACHE
⮚ Some systems maintain a separate section of main memory for a buffer cache.
⮚ Blocks are kept there under the assumption that they will be used again shortly.

PAGE CACHE
⮚ Uses virtual memory techniques to cache file data as pages rather than as file-system-oriented blocks.

UNIFIED VIRTUAL MEMORY
⮚ Caching file data using virtual addresses is far more efficient than caching through physical disk blocks, as accesses interface with virtual memory rather than the file system.
⮚ Several systems, including Solaris, Linux, and Windows, use page caching to cache both process pages and file data.
▪ Some versions of UNIX and Linux provide a unified buffer cache.

▪ There are two alternatives for opening and accessing a file:
▪ One approach is to use memory mapping.
▪ The second is to use the standard system calls read() and write().

▪ Without a unified buffer cache, the read() and write() system calls go through the buffer cache.

▪ A memory mapping proceeds by reading in disk blocks from the file system and storing them in the buffer cache; the contents must then be copied into the page cache.

▪ This situation, known as double caching, requires caching file-system data twice.

▪ Synchronous writes occur in the order in which the disk subsystem receives them, and the writes are not buffered.

▪ Asynchronous writes occur when the data are stored in the cache and control returns to the caller.

▪ Free-behind removes a page from the buffer as soon as the next page is requested.

▪ With read-ahead, a requested page and several subsequent pages are read and cached.
12.7 RECOVERY

▪ Consistency checking – compares data in the directory structure with data blocks on disk, and tries to fix inconsistencies.

▪ Consistency checker – a systems program, such as fsck in UNIX, that performs this comparison and tries to fix any inconsistencies it finds.

▪ Use system programs to back up data from disk to another storage device (floppy disk, magnetic tape, other magnetic disk, optical).
Log-Structured File Systems
▪ Log-structured (or journaling) file systems record each update to the file system as a transaction.
-A transaction is the set of operations for performing a specific task.

▪ The log may be kept as a circular buffer, which writes to the end of its space and then continues at the beginning, overwriting older values as it goes.

▪ All transactions are written to a log.
-A transaction is considered committed once it is written to the log.
-However, the file system may not yet be updated.

▪ The transactions in the log are asynchronously written to the file system.
▪ When the file system is modified, the transaction is removed from the log.
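
The commit-then-apply ordering can be sketched in a few lines. This is a toy model under stated assumptions: the log and the file system are in-memory Python structures, a transaction is a dictionary of updates, and `JournaledStore` is a hypothetical name, not any real journaling implementation.

```python
class JournaledStore:
    def __init__(self):
        self.log = []   # committed transactions not yet applied
        self.data = {}  # stands in for the on-disk file-system structures

    def commit(self, updates):
        # The transaction counts as committed once it is in the log,
        # even though self.data has not been touched yet.
        self.log.append(dict(updates))

    def apply_one(self):
        if self.log:
            self.data.update(self.log[0])
            self.log.pop(0)  # removed only after the update is applied

    def recover(self):
        # After a crash, replay every transaction still in the log.
        while self.log:
            self.apply_one()


store = JournaledStore()
store.commit({"inode 7": "size=1", "free map": "block 12 used"})
state_before = dict(store.data)
store.recover()
```

Because a transaction leaves the log only after it has been applied, anything still in the log at recovery time can simply be replayed.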
Other Solutions
Network Appliance’s WAFL file system
▪ This system never overwrites blocks with new data; rather, it writes all data and metadata changes to new blocks.
▪ A snapshot is created if the old pointers and blocks are kept.
▪ It is a view of the file system before the last update took place.

Solaris ZFS file system
▪ ZFS takes an even more innovative approach to disk consistency.
▪ It never overwrites blocks, just like WAFL.
▪ However, ZFS goes further and provides checksumming of all metadata and data blocks.
Backup and Restore
▪ System programs can be used to back up data from disk to another storage device, such as a magnetic tape or other hard disk.

▪ Recovery from the loss of an individual file, or of an entire disk, may then be a matter of restoring the data from backup.

Typical backup schedule:
▪ Day 1. Copy to a backup medium all files from the disk (Full Backup).
▪ Day 2. Copy to another medium all files changed since day 1 (Incremental Backup).
▪ Day 3. Copy to another medium all files changed since day 2.
▪ Day N. Copy to another medium all files changed since day N−1. Then go back to day 1.
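
The schedule above amounts to selecting files by modification time. A minimal sketch, assuming each file's last-modified time is available as a number in a dictionary (`files_to_copy` and the sample data are made up for illustration):

```python
def files_to_copy(mtimes, last_backup_time=None):
    """Full backup when there is no previous backup, else incremental."""
    if last_backup_time is None:
        return sorted(mtimes)  # day 1: every file
    # Later days: only files changed since the previous backup.
    return sorted(name for name, t in mtimes.items() if t > last_backup_time)


mtimes = {"a.txt": 1, "b.txt": 5, "c.txt": 9}  # file -> last-modified time
```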
12.8 Network File System (NFS)
▪ Network file systems are commonplace.

▪ They are typically integrated with the overall directory structure and interface of the client system.

▪ NFS is both an implementation and a specification of a software system for accessing remote files across LANs (or even WANs).

▪ NFS is part of ONC+, which most UNIX vendors and some PC operating systems support.

▪ The implementation described here is part of the Solaris operating system, which is a modified version of UNIX SVR4. It uses either the TCP or UDP/IP protocol (depending on the interconnecting network).
Client-Server Model: NFS allows file system sharing between interconnected workstations through a client-server relationship. A machine can act as both a client and a server.

Independent File Systems: Each machine has its own independent file system, and sharing is only enabled upon request from the client machine. The sharing of a remote file system only affects the client machine.

Mount Operation: To access a remote directory, a client must perform a mount operation. This operation links the remote directory to a local directory on the client machine, making the remote directory appear as part of the local file system.
Transparent Access: Once mounted, the remote directory can be accessed transparently by users on the client machine as if it were a local directory. However, the mount operation itself requires the specification of the remote directory's location.

Cascading Mounts: In some NFS implementations, it is possible to mount one remote file system over another that has already been mounted remotely, allowing complex file system hierarchies.

No Transitivity: Mounting a remote file system does not grant access to other file systems mounted over the first one. Each machine is affected only by the mounts it has explicitly invoked.
User Mobility: NFS supports user mobility by allowing shared file systems (like home directories) to be mounted across different machines, enabling users to access their environment from any workstation.

Heterogeneous Environment: NFS is designed to work in environments with different machines, operating systems, and network architectures, using RPC (Remote Procedure Call) and XDR (External Data Representation) to maintain compatibility.

Separate Protocols: NFS separates the protocols for the mount mechanism and remote file access. The mount protocol handles the mounting process, while the NFS protocol manages remote file access, both implemented using RPCs.
The Mount Protocol
▪ The mount protocol establishes the initial logical connection between a server and a client.

▪ In Solaris, each machine has a server process, outside the kernel, performing the protocol functions.

▪ A mount operation includes the name of the remote directory to be mounted and the name of the server machine storing it.

▪ The server maintains an export list that specifies local file systems that it exports for mounting, along with names of machines that are permitted to mount them.
The NFS Protocol
The NFS protocol provides a set of RPCs for remote file operations. The procedures support the following operations:

▪ Searching for a file within a directory
▪ Reading a set of directory entries
▪ Manipulating links and directories
▪ Accessing file attributes
▪ Reading and writing files

▪ The client initiates the operation with a regular system call. The operating-system layer maps this call to a VFS operation on the appropriate vnode.

▪ The VFS layer identifies the file as a remote one and invokes the appropriate NFS procedure.

▪ An RPC call is made to the NFS service layer at the remote server.

▪ This call is reinjected to the VFS layer on the remote system, which finds that it is local and invokes the appropriate file-system operation. This path is retraced to return the result.
Path-Name Translation
▪ Path-name translation in NFS involves the parsing of a path name such as /usr/local/dir1/file.txt into separate directory entries, or components: (1) usr, (2) local, and (3) dir1.

▪ It is done by breaking the path into component names and performing a separate NFS lookup call for every pair of component name and directory vnode.

▪ This expensive path-name-traversal scheme is needed, since the layout of each client’s logical name space is unique, dictated by the mounts the client has performed.
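
Component-by-component translation can be sketched as a loop that performs one lookup per component. Here directories are modeled as nested dictionaries and `lookup` is a hypothetical stand-in for a single NFS lookup call; a real client would issue an RPC to the server for each one.

```python
def lookup(dir_vnode, name):
    """Stand-in for one lookup call: (directory vnode, name) -> vnode."""
    return dir_vnode[name]


def translate(root, path):
    """Resolve a path one component at a time, counting the lookups."""
    vnode = root
    calls = 0
    for component in path.strip("/").split("/"):
        vnode = lookup(vnode, component)
        calls += 1
    return vnode, calls


root = {"usr": {"local": {"dir1": {"file.txt": "vnode-of-file.txt"}}}}
```

Resolving /usr/local/dir1/file.txt this way costs four lookups: three for the directory components and one for the file itself.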
Remote Operations
▪ With the exception of opening and closing files, there is an almost one-to-one correspondence between the regular UNIX system calls for file operations and the NFS protocol RPCs.

▪ Thus, a remote file operation can be translated directly to the corresponding RPC. Conceptually, NFS adheres to the remote-service paradigm; but in practice, buffering and caching techniques are employed for the sake of performance.

▪ There are two caches:
-file-attribute (inode-information) cache
-file-blocks cache
12.9 EXAMPLE: THE WAFL FILE SYSTEM
THE WAFL FILE SYSTEM
▪ The file system is similar to the Berkeley Fast File System, with many modifications.

▪ It is block-based and uses inodes to describe files.

▪ Each inode contains 16 pointers to blocks (or indirect blocks) belonging to the file described by the inode.

▪ Each file system has a root inode.

▪ All of the metadata lives in files.

▪ All inodes are in one file, the free-block map in another, and the free-inode map in a third.
▪ WAFL’s snapshot facility is very efficient in that it does not even require that copy-on-write copies of each data block be taken before the block is modified. Other file systems provide snapshots, but frequently with less efficiency.

▪ Newer versions of WAFL actually allow read–write snapshots, known as CLONES.

▪ Clones are also efficient, using the same techniques as snapshots.

▪ The original snapshot is unmodified, still giving a view into the file system as it was before the clone was updated.

▪ Clones can also be promoted to replace the original file system; this involves throwing out all of the old pointers and any associated old blocks.
▪ Replication is the duplication and synchronization of a set of data over a network to another system.

▪ First, a snapshot of a WAFL file system is duplicated to another system.

▪ When another snapshot is taken on the source system, it is relatively easy to update the remote system just by sending over all blocks contained in the new snapshot.

▪ These blocks are the ones that have changed between the times the two snapshots were taken.

▪ The remote system adds these blocks to the file system and updates its pointers, and the new system then is a duplicate of the source system as of the time of the second snapshot.

▪ ZFS file system supports similarly efficient snapshots, clones, and replication.
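
Snapshot-based replication can be sketched as "send only what changed". This toy model assumes a snapshot is a dictionary from block position to block contents and ignores deleted blocks; it does not reflect how WAFL or ZFS actually encode snapshots, and both function names are illustrative.

```python
def changed_blocks(old_snapshot, new_snapshot):
    """Blocks that are new or differ between the two snapshots."""
    return {pos: data for pos, data in new_snapshot.items()
            if old_snapshot.get(pos) != data}


def apply_update(replica, delta):
    """The remote system adds the received blocks and updates its pointers."""
    updated = dict(replica)
    updated.update(delta)
    return updated


snap1 = {0: "A", 1: "B"}
snap2 = {0: "A", 1: "C", 2: "D"}
delta = changed_blocks(snap1, snap2)  # only this crosses the network
replica = apply_update(snap1, delta)
```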
SUMMARY
▪ The file system is stored on secondary storage, like disks, and manages data across multiple file systems using partitions.

▪ It operates in layers: lower levels handle physical storage, and upper levels manage logical file names. Different file systems are unified through a Virtual File System (VFS) layer, allowing consistent access.

▪ Files can be allocated in three ways: contiguous (prone to fragmentation), linked (inefficient for direct access), and indexed (requires overhead). Optimizations, like using extents or clusters, improve efficiency.

▪ Free-space management impacts disk performance and reliability, using methods like bit vectors and linked lists. Directory management focuses on efficiency, with hash tables being common but vulnerable to damage.

▪ Network file systems (e.g., NFS) enable remote file access but face challenges with consistency and performance. Reliability is enhanced through techniques like log structures and RAID, while performance is boosted by caching and specialized file systems like WAFL.
THANK YOU FOR LISTENING!
