0% found this document useful (0 votes)
15 views42 pages

12_File System Implementation[1]

Chapter 14 discusses the implementation of file systems, covering their structure, operations, directory implementation, allocation methods, and free-space management. It highlights the importance of efficiency and performance, as well as recovery techniques, using the WAFL file system as a concrete example. Various allocation methods such as contiguous, linked, and indexed allocation are explored, along with free-space management strategies and their implications on system performance.

Uploaded by

khizar abbas
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
15 views42 pages

12_File System Implementation[1]

Chapter 14 discusses the implementation of file systems, covering their structure, operations, directory implementation, allocation methods, and free-space management. It highlights the importance of efficiency and performance, as well as recovery techniques, using the WAFL file system as a concrete example. Various allocation methods such as contiguous, linked, and indexed allocation are explored, along with free-space management strategies and their implications on system performance.

Uploaded by

khizar abbas
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 42

Chapter 14: File System

Implementation
Outline
• File-System Structure
• File-System Operations
• Directory Implementation
• Allocation Methods
• Free-Space Management
• Efficiency and Performance
• Recovery
• Example: WAFL File System
Objectives
• Describe the details of implementing
local file systems and directory
structures
• Discuss block allocation and free-
block algorithms and trade-offs
• Explore file system efficiency and
performance issues
• Look at recovery from file system
failures
• Describe the WAFL file system as a
concrete example
File-System Structure
• File structure
– Logical storage unit
– Collection of related information
• File system resides on secondary
storage (disks)
– Provided user interface to storage,
mapping logical to physical
– Provides efficient and convenient access
to disk by allowing data to be stored,
located retrieved easily
• Disk provides in-place rewrite and
random access
Layered File System
File System Layers
• Device drivers manage I/O devices at the I/O control layer
Given commands like
read drive1, cylinder 72, track 2, sector 10, into memory location 1060
Outputs low-level hardware specific commands to
hardware controller
• Basic file system given command like “retrieve block 123”
translates to device driver
• Also manages memory buffers and caches (allocation,
freeing, replacement)
– Buffers hold data in transit
– Caches hold frequently used data
• File organization module understands files, logical
address, and physical blocks
• Translates logical block # to physical block #
• Manages free space, disk allocation
File System Layers (Cont.)
• Logical file system manages metadata
information
– Translates file name into file number, file
handle, location by maintaining file control
blocks (inodes in UNIX)
– Directory management
– Protection
• Layering useful for reducing complexity and
redundancy, but adds overhead and can
decrease performance
• Logical layers can be implemented by any
coding method according to OS designer
File System Layers (Cont.)
• Many file systems, sometimes more
than one within an operating system
– Each with its own format:
– CD-ROM is ISO 9660;
– Unix has UFS, FFS;
– Windows has FAT, FAT32, NTFS as well
as floppy, CD, DVD Blu-ray,
– Linux has more than 130 types, with
extended file system ext3 and ext4
leading; plus distributed file systems, etc.)
– New ones still arriving – ZFS, GoogleFS,
FUSE
File-System Operations
• We have system calls at the API level, but
how do we implement their functions?
– On-disk and in-memory structures
• Boot control block contains info needed by
system to boot OS from that volume
– Needed if volume contains OS, usually first block
of volume
• Volume control block (superblock, master
file table) contains volume details
– Total # of blocks, # of free blocks, block size, free
block pointers or array
• Directory structure organizes the files
– Names and inode numbers, master file table
File Control Block (FCB)
• OS maintains FCB per file, which
contains many details about the file
– Typically, inode number, permissions,
size, dates
– Example
In-Memory File System Structures
• Mount table contains information
about each mounted volume.
• System-wide open-file table
contains a copy of the FCB of
each file and other info
• Per-process open-file table
contains pointers to appropriate
entries in system-wide open-file
table as well as other info
In-Memory File System Structures (Cont.)

• Figure 12-3(a) refers to opening a file


• Figure 12-3(b) refers to reading a file
Directory Implementation
• Linear list of file names with pointer to the data
blocks
– Simple to program
– Time-consuming to execute
• Linear search time
• Could keep ordered alphabetically via linked list
or use B+ tree
• Hash Table – linear list with hash data structure
– Decreases directory search time
– Collisions – situations where two file names hash to
the same location
– Only good if entries are fixed size, or use chained-
overflow method
Allocation Method
• An allocation method refers to
how disk blocks are allocated for
files:
– Contiguous
– Linked
– File Allocation Table (FAT)
Contiguous Allocation Method

• Each file occupies set of contiguous


blocks
– Best performance in most cases
– Simple – only starting location (block #) and
length (number of blocks) are required
– Problems include:
• Finding space on the disk for a file,
• Knowing file size,
• External fragmentation, need for compaction
off-line (downtime) or on-line
Contiguous Allocation (Cont.)
• Mapping from logical
to physical (block
size =512 bytes)
Q
LA/512
R

• Block to be accessed
= starting address +
Q
• Displacement into
block = R
Linked Allocation
• Each file is a linked list of blocks
• File ends at nil pointer
• No external fragmentation
• Each block contains pointer to next block
• No compaction, external fragmentation
• Free space management system called when
new block needed
• Improve efficiency by clustering blocks into
groups but increases internal fragmentation
• Reliability can be a problem
• Locating a block can take many I/Os and disk
seeks
Linked Allocation Example
• Each file is a linked list of disk blocks:
blocks may be scattered anywhere on
the disk
• Scheme
Linked Allocation (Cont.)
• Mapping
Q
LA/511
R

• Block to be accessed is the Qth


block in the linked chain of blocks
representing the file.
• Displacement into block = R + 1
File-Allocation Table
Indexed Allocation Method

• Each file has its own index block(s) of


pointers to its data blocks
• Logical view

index table
Example of Indexed Allocation
Indexed Allocation – Small Files
• Need index table
• Random access
• Dynamic access without external fragmentation,
but have overhead of index block
• Mapping from logical to physical in a file of
maximum size of 256K bytes and block size of
512 bytes. We need only 1 block for index table
Q
LA/512
R
• Calculation:
– Q = displacement into index table
– R = displacement into block
Indexed Allocation – Large Files
• Mapping from logical to physical in
a file of unbounded length (block
size of 512 words)
– Linked scheme – Link blocks of index
table (no limit on size)
– Multi-level indexing
Indexed Allocation – Two-level Scheme
• Two-level index (4K blocks could store 1,024 four-
byte pointers in outer index -> 1,048,567 data
blocks and file size of up to 4GB)
Q1
LA / (512 x 512)
R1
• Mapping scheme for outer-index:
– Q1 = displacement into outer-index
– R1 is used as follows:
Q2
R1 / 512
R2

– Mapping scheme for index level:


• Q2 = displacement into block of index table
• R2 displacement into block of file
Indexed Allocation – Two-Level Scheme
Free-Space Management
• File system maintains free-space list to track
available blocks/clusters
– (Using term “block” for simplicity)
• Bit vector or bit map (n blocks)
01 2 n-1

1 block[i] free
bit[i] =


0  block[i] occupied
Block number calculation
(number of bits per word) *
(number of 0-value words) +
offset of first 1 bit
Example : Free-Space Management
• Number of bits per word = 32 (i.e., each word
is 32 bits long).
• Number of 0-value words = 3 (meaning the
first three words are filled entirely with 0s).
• Offset of the first 1-bit = 7 (meaning the first
1-bit occurs at the 7th position in the next
word, relative to that word).
• Now, applying the formula:
• Block Number = 32×3+7=96+7=103
• So, the block number would be 103
Free-Space Management
• File system maintains free-space list to
track available blocks
• Bit vector or bit map (n blocks)
• Bit map requires extra space
– Example:
block size = 4KB = 212 bytes
disk size = 240 bytes (1 terabyte)
n = 240/212 = 228 bits (or 32MB)
if clusters of 4 blocks -> 8MB of
memory

• Easy to get contiguous files


Linked Free Space List on Disk

 Linked list (free list)


• Cannot get
contiguous space
easily
• No waste. Linked
Free Space List on
Disk of space
• No need to
traverse the entire
list (if # free blocks
recorded)
Free-Space Management (Cont.)

• Grouping
– Modify linked list to store address of next
n-1 free blocks in first free block, plus a
pointer to next block that contains free-
block-pointers (like this one)
• Counting
– Because space is frequently contiguously
used and freed, with contiguous-
allocation, extents, or clustering
• Keep address of first free block and count of
following free blocks
• Free space list then has entries containing
addresses and counts
TRIMing Unused Blocks

• HDDS overwrite in place so need only free list


• Blocks not treated specially when freed
– Keeps its data but without any file pointers to it,
until overwritten
• Storage devices not allowing overwrite (like
NVM) suffer badly with same algorithm
– Must be erased before written, erases made in
large chunks (blocks, composed of pages) and are
slow
– TRIM is a newer mechanism for the file system to
inform the NVM storage device that a page is free
• Can be garbage collected or if block is free, now
block can be erased
Efficiency and Performance

• Efficiency dependent on:


– Disk allocation and directory algorithms
– Types of data kept in file’s directory entry
– Pre-allocation or as-needed allocation of
metadata structures
– Fixed-size or varying-size data structures
Efficiency and Performance (Cont.)
• Performance
– Keeping data and metadata close together
– Buffer cache – separate section of main memory
for frequently used blocks
– Synchronous writes sometimes requested by
apps or needed by OS
• No buffering / caching – writes must hit disk before
acknowledgement
• Asynchronous writes more common, buffer-able,
faster
– Free-behind and read-ahead – techniques to
optimize sequential access
– Reads frequently slower than writes
Page Cache
• A page cache caches pages
rather than disk blocks using
virtual memory techniques and
addresses
• Memory-mapped I/O uses a page
cache
• Routine I/O through the file
system uses the buffer (disk)
cache
• This leads to the following figure
I/O Without a Unified Buffer Cache
Unified Buffer Cache

• A unified buffer cache uses


the same page cache to cache
both memory-mapped pages
and ordinary file system I/O to
avoid double caching
 But which caches get priority, and
what replacement algorithms to use?
I/O Using a Unified Buffer Cache
Recovery
• Consistency checking – compares
data in directory structure with data
blocks on disk, and tries to fix
inconsistencies
– Can be slow and sometimes fails
• Use system programs to back up data
from disk to another storage device
(magnetic tape, other magnetic disk,
optical)
• Recover lost file or disk by restoring
data from backup
Example: WAFL File System

• Used on Network Appliance “Filers” –


distributed file system appliances
• “Write-anywhere file layout”
• Serves up NFS, CIFS, http, ftp
• Random I/O optimized, write optimized
– NVRAM for write caching
• Similar to Berkeley Fast File System,
with extensive modifications
The WAFL File Layout
Snapshots in WAFL

You might also like