File System

Uploaded by

zahid

File System Implementation

Yejin Choi ([email protected])


Layered File System
• Logical file system
– Maintains file structure via the FCB (file control block)
• File organization module
– Translates logical blocks to physical blocks
• Basic file system
– Converts a physical block to disk parameters (drive 1, cylinder 73, track 2, sector 10, etc.)
• I/O control
– Transfers data between memory and disk
Physical Disk Structure
• Parameters needed to read from disk:
– cylinder (= track) #
– platter (= surface) #
– sector #
– transfer size
File system Units
• Sector – the smallest unit that can be accessed on a disk (typically 512 bytes)
• Block (or cluster) – the smallest unit that can be allocated to construct a file
• What is the actual size of a 1-byte file on disk?
– It takes at least one cluster,
– which may consist of 1 to 8 sectors,
– so a 1-byte file may occupy up to about 4 KB (8 × 512 bytes) of disk space.
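The arithmetic behind the 1-byte question can be sketched in a few lines. This is a minimal illustration, assuming 512-byte sectors and that a non-empty file always occupies a whole number of clusters; the function name `allocated_bytes` is made up for this example.

```python
import math

SECTOR_SIZE = 512  # bytes; the typical sector size quoted on the slide


def allocated_bytes(file_size, sectors_per_cluster):
    """Disk space consumed when space is handed out in whole clusters.

    Assumption of this sketch: file_size >= 1, so at least one cluster is used.
    """
    cluster_size = SECTOR_SIZE * sectors_per_cluster
    clusters = max(1, math.ceil(file_size / cluster_size))
    return clusters * cluster_size


print(allocated_bytes(1, 1))     # 512: one sector per cluster
print(allocated_bytes(1, 8))     # 4096: eight sectors per cluster
print(allocated_bytes(5000, 8))  # 8192: spills into a second 4 KB cluster
```

The gap between the file size and the allocated size is the per-file waste that grows with the cluster size.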
Sector~Cluster~File layout
FCB – File Control Block
• Contains file attributes + block locations
– Permissions
– Dates (create, access, write)
– Owner, group, ACL (Access Control List)
– File size
– Location of file contents
• UNIX File System → I-node
• FAT/FAT32 → part of FAT (File Allocation Table)
• NTFS → part of MFT (Master File Table)
Partitions
• Disks are broken into one or more partitions.
• Each partition can have its own file system type (UFS, FAT, NTFS, …).
A Disk Layout for A File System

[ Boot block ][ Super block ][ File descriptors (FCBs) ][ File data blocks ]

• Super block defines a file system
– size of the file system
– size of the file descriptor area
– start of the list of free blocks
– location of the FCB of the root directory
– other metadata such as permissions and times
• Where should we put the boot image?
Boot block
• Dual boot
– Multiple operating systems can be installed on one machine.
– How does the system know what to boot, and how?

• Boot loader
– Understands different operating systems and file systems.
– Resides at a particular location on disk.
– Reads the boot block to find the boot image.
Block Allocation
• Contiguous allocation
• Linked allocation
• Indexed allocation
Contiguous Block Allocation
• Pros:
– Efficient read/seek. Why?
→ The disk location for both sequential and random access can be computed instantly.
→ Spatial locality on disk

• Cons:
– When creating a file, we don’t know how many blocks it may eventually require…
→ What happens if we run out of contiguous blocks?
– Disk fragmentation!
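A minimal sketch of both points, assuming a file is described only by its start block and length, and that free space is a list of booleans. The names `contiguous_lookup` and `find_contiguous_run` are hypothetical and exist only for this illustration.

```python
def contiguous_lookup(start_block, length, logical_block):
    """Physical block of a logical block: a single addition, no disk reads."""
    if not 0 <= logical_block < length:
        raise IndexError("logical block outside the file")
    return start_block + logical_block


def find_contiguous_run(free_map, n):
    """Index of the first run of n consecutive free blocks, or None."""
    run_start, run_len = 0, 0
    for i, is_free in enumerate(free_map):
        if is_free:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == n:
                return run_start
        else:
            run_len = 0
    return None


print(contiguous_lookup(14, 3, 2))  # 16

# Four blocks are free in total, but no three of them are adjacent.
free_map = [True, False, True, True, False, True]
print(find_contiguous_run(free_map, 2))  # 2
print(find_contiguous_run(free_map, 3))  # None
```

The last call shows the fragmentation problem: enough free space exists, but a file needing three contiguous blocks cannot be placed.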
Linked Block Allocation
• Pros:
– Less fragmentation
– Flexible file allocation

• Cons:
– A sequential read may require a disk seek to jump to the next block. (Still not too bad…)
– A random read is very inefficient!
→ Seeking to a block takes O(n) time (n = # of blocks in the file)
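The O(n) cost of random access can be seen in a small sketch. It assumes the "next block" pointers are available as a dictionary, whereas on a real disk each hop would be a block read. The block numbers and the name `linked_lookup` are invented for this example.

```python
def linked_lookup(first_block, next_pointer, logical_block):
    """Follow the chain from the first block; returns (physical block, hops)."""
    block = first_block
    hops = 0
    for _ in range(logical_block):
        block = next_pointer[block]  # on disk, this is one more block read
        hops += 1
        if block is None:
            raise IndexError("logical block outside the file")
    return block, hops


# A five-block file scattered over the disk: 9 -> 16 -> 1 -> 10 -> 25
next_pointer = {9: 16, 16: 1, 1: 10, 10: 25, 25: None}

print(linked_lookup(9, next_pointer, 0))  # (9, 0)
print(linked_lookup(9, next_pointer, 3))  # (10, 3)
```

Reaching logical block k always costs k hops, which is why random reads scale with the file length.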
Indexed Block Allocation
• Maintain an array of pointers to blocks.

• Random access becomes as easy as sequential access!

• Example: UNIX File System
Free Space Management
• What happens when a file is deleted?
→ We need to keep track of free blocks…

• Bit Vector (or Bit Map)
• Linked List
Bit Vector (= Bit Map)
• Pros
– Can be very efficient with hardware support
– We can find n free blocks in a single scan.
• Cons
– The bitmap grows with the disk size. It is inefficient if the entire bitmap can’t be loaded into memory.
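A minimal bit-vector sketch, assuming the convention that bit i set to 1 means block i is free (the slides do not fix a convention). The class `BitVector` and the helper `bitmap_bytes` are hypothetical names. A Python integer stands in for the bitmap, and the lowest-set-bit trick plays the role of the hardware support mentioned above.

```python
class BitVector:
    """Free-space map: bit i == 1 means block i is free (assumed convention)."""

    def __init__(self, n_blocks):
        self.n_blocks = n_blocks
        self.bits = (1 << n_blocks) - 1  # every block starts out free

    def allocate(self):
        """Claim the lowest-numbered free block."""
        if self.bits == 0:
            raise RuntimeError("no free blocks")
        block = (self.bits & -self.bits).bit_length() - 1  # lowest set bit
        self.bits &= ~(1 << block)
        return block

    def free(self, block):
        """Mark a block as free again, e.g. when a file is deleted."""
        self.bits |= 1 << block


def bitmap_bytes(disk_bytes, block_bytes):
    """Size of the bitmap itself: one bit per block."""
    return disk_bytes // block_bytes // 8


bv = BitVector(8)
print(bv.allocate(), bv.allocate())  # 0 1
bv.free(0)
print(bv.allocate())                 # 0 again

# A 1 TB disk with 4 KB blocks needs a 32 MB bitmap.
print(bitmap_bytes(2**40, 4096) // 2**20)  # 32
```

The last line illustrates the stated drawback: the bitmap grows linearly with the disk.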
Linked List
• Pros
– No need to keep a global table.

• Cons
– To find more than one free block, we have to access the free blocks on disk one by one.
– Traversing the free list may require substantial I/O.
UNIX file layout overview
I-node
• FCB (file control block) of UNIX
• Each i-node contains 15 block pointers
– 12 direct block pointers and 3 indirect (single, double, triple) pointers.
• Block size is 4 KB
→ Thus, with 12 direct pointers, the first 48 KB are directly reachable from the i-node.
I-node block indexing
I-node addressing space
Recall that the block size is 4 KB, so an indirect block contains 1024 (= 4 KB / 4 bytes) entries.

• A single-indirect block can address 1024 × 4 KB = 4 MB of data
• A double-indirect block can address 1024 × 1024 × 4 KB = 4 GB of data
• A triple-indirect block can address 1024 × 1024 × 1024 × 4 KB = 4 TB of data

Any block can be found with at most 3 indirections.
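The figures above can be checked with a short sketch. It assumes exactly the parameters on the slide (4 KB blocks, 4-byte pointers, 12 direct pointers plus one single, one double, and one triple indirect pointer). The function names are illustrative only.

```python
BLOCK_SIZE = 4096                            # 4 KB blocks
POINTER_SIZE = 4                             # 4-byte block pointers
N_DIRECT = 12                                # direct pointers in the i-node
PTRS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE  # 1024 entries per indirect block


def indirection_level(logical_block):
    """0 = direct, 1 = single, 2 = double, 3 = triple indirect."""
    if logical_block < 0:
        raise ValueError("negative block number")
    if logical_block < N_DIRECT:
        return 0
    logical_block -= N_DIRECT
    for level in (1, 2, 3):
        span = PTRS_PER_BLOCK ** level  # blocks reachable at this level
        if logical_block < span:
            return level
        logical_block -= span
    raise ValueError("beyond the maximum file size")


def max_file_bytes():
    """Total data addressable through all 15 pointers."""
    blocks = N_DIRECT + sum(PTRS_PER_BLOCK ** k for k in (1, 2, 3))
    return blocks * BLOCK_SIZE


print(indirection_level(48 * 1024 // BLOCK_SIZE - 1))  # 0: last direct block
print(indirection_level(48 * 1024 // BLOCK_SIZE))      # 1: first byte past 48 KB
print(max_file_bytes() == 48 * 2**10 + 4 * 2**20 + 4 * 2**30 + 4 * 2**40)  # True
```

The maximum file size is therefore the sum of the four regions: 48 KB + 4 MB + 4 GB + 4 TB.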


File Layout in UNIX
Partition layout in UNIX

• Boot block
• Super block
• FCBs
– (I-nodes in Unix, FAT or MFT in Windows)
• Data blocks
Unix Directory
• Internally, the same as a file.
• A file whose type field marks it as a directory,
– so that only the system has certain access permissions.
• Contains <file name, i-node number> tuples.
Unix Directory Example
– How do we look up /usr/bob/mbox?

Root directory      Block 132 (/usr)     Block 406 (/usr/bob)
  1  .                6  .                 26  .
  1  ..               1  ..                 6  ..
  4  bin             26  bob               12  grants
  7  dev             17  jeff              81  books
 14  lib             14  sue               60  mbox
  9  etc             51  sam               17  Linux
  6  usr             29  mark
  8  tmp

I-node 6 points to block 132; I-node 26 points to block 406.

1. Looking up usr in the root directory gives I-node 6.
2. I-node 6 says the relevant data (bob) is in block 132.
3. Looking up bob gives I-node 26.
4. I-node 26 says the data for /usr/bob is in block 406.
5. Aha! Looking up mbox gives I-node 60, which has the contents of mbox.
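The same walk can be written as a loop over path components. This is a sketch that hard-codes the tables from the example; the dictionary layout and the names `read_directory` and `lookup` are assumptions made for illustration. Only the two i-node to block mappings given in the example are modelled, so other subdirectories cannot be entered.

```python
ROOT_INODE = 1

# <file name, i-node number> tuples from the example
ROOT_ENTRIES = {".": 1, "..": 1, "bin": 4, "dev": 7, "lib": 14,
                "etc": 9, "usr": 6, "tmp": 8}

DIRECTORY_BLOCKS = {
    132: {".": 6, "..": 1, "bob": 26, "jeff": 17, "sue": 14,
          "sam": 51, "mark": 29},
    406: {".": 26, "..": 6, "grants": 12, "books": 81,
          "mbox": 60, "Linux": 17},
}

# Which data block each directory i-node points to (from the example)
INODE_TO_BLOCK = {6: 132, 26: 406}


def read_directory(inode):
    """Return the name -> i-node table stored in a directory's data block."""
    if inode == ROOT_INODE:
        return ROOT_ENTRIES
    return DIRECTORY_BLOCKS[INODE_TO_BLOCK[inode]]


def lookup(path):
    """Resolve an absolute path to an i-node number, one component at a time."""
    inode = ROOT_INODE
    for name in (part for part in path.split("/") if part):
        inode = read_directory(inode)[name]
    return inode


print(lookup("/usr"))           # 6
print(lookup("/usr/bob"))       # 26
print(lookup("/usr/bob/mbox"))  # 60
```

Each component costs one i-node read plus one directory block read, so deep paths are more expensive to open than shallow ones.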
File System Maintenance
• Format
– Create the file system layout: super block, i-nodes…
• Bad blocks
– Most disks have some, and the number increases as the disk ages
– Keep them in a bad-block list
– “scandisk”
• De-fragmentation
– Rearrange blocks so that files are stored more contiguously
• Scanning
– Run after a system crash
– Corrects inconsistent file descriptors
Windows File System
• FAT
• FAT32
• NTFS
FAT
• FAT == File Allocation Table
• FAT is located at the top of the volume.
– Two copies are kept in case one becomes damaged.
• Cluster size is determined by the size of the volume.
– Why?
Volume size vs. cluster size

Drive size                  Cluster size        Number of sectors
--------------------------  ------------------  -----------------
512 MB or less              512 bytes           1
513 MB to 1024 MB (1 GB)    1024 bytes (1 KB)   2
1025 MB to 2048 MB (2 GB)   2048 bytes (2 KB)   4
2049 MB and larger          4096 bytes (4 KB)   8
FAT block indexing
FAT Limitations
• Each entry that references a cluster is 16 bits wide.
→ Thus at most 2^16 = 65,536 clusters are addressable.
→ Partitions are limited in size to 2~4 GB.
→ Too small for today’s hard disk capacity!

• For partitions over 200 MB, performance degrades rapidly.
→ Wasted space in each cluster increases.

• Two copies of the FAT…
→ still susceptible to a single point of failure!
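The 2~4 GB limit follows from multiplying the number of addressable clusters by the cluster size. The sketch below assumes maximum cluster sizes of 32 KB and 64 KB, which the slides do not state, and ignores any reserved entry values. The function names are made up for this illustration.

```python
KB = 2**10
GB = 2**30


def max_partition_bytes(entry_bits, cluster_bytes):
    """Upper bound on addressable space: one table entry per cluster."""
    return (2 ** entry_bits) * cluster_bytes


def slack_bytes(file_size, cluster_bytes):
    """Bytes wasted in the last, partly filled cluster of a file."""
    return (-file_size) % cluster_bytes


# 16-bit entries with 32 KB or 64 KB clusters (assumed maximum cluster sizes)
print(max_partition_bytes(16, 32 * KB) // GB)  # 2
print(max_partition_bytes(16, 64 * KB) // GB)  # 4

# The price of large clusters: a 1-byte file wastes almost a whole cluster.
print(slack_bytes(1, 32 * KB))  # 32767
```

With a fixed 16-bit entry, the only way to cover a larger partition is a larger cluster, which is why wasted space per file grows with partition size.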
FAT32
Enhancements over FAT

• More efficient space usage
– By using smaller clusters.
– Why is this possible? 32-bit entries…
• More robust and flexible
– The root folder became an ordinary cluster chain, so it can be located anywhere on the drive.
– A backup copy of the file allocation table is kept.
– Less susceptible to a single point of failure.
NTFS
• MFT == Master File Table
– Analogous to the FAT

• Design objectives
1) Fault tolerance
→ Built-in transaction logging feature.
2) Security
→ Granular (per file/directory) security support.
3) Scalability
→ Handles huge disks efficiently.
Bonus Materials

• More details of NTFS
• OS-wide overview of file system
NTFS
• Scalability
– NTFS references clusters with 64-bit addresses.
– Thus, even with small sized clusters, NTFS can map
disks up to sizes that we won't likely see even in the
next few decades.
• Reliability
– Under NTFS, a log of transactions is maintained so
that CHKDSK can roll back transactions to the last
commit point in order to recover consistency within
the file system.
– Under FAT, CHKDSK checks the consistency of
pointers within the directory, allocation, and file tables.
NTFS Metadata Files
Name       Description
---------  -----------------------------------------------------
$MFT       Master File Table
$MFTMIRR   Copy of the first 16 records of the MFT
$LOGFILE   Transactional logging file
$VOLUME    Volume serial number, creation time, and dirty flag
$ATTRDEF   Attribute definitions
.          Root directory of the disk
$BITMAP    Cluster map (in-use vs. free)
$BOOT      Boot record of the drive
$BADCLUS   Lists bad clusters on the drive
$QUOTA     User quota
$UPCASE    Maps lowercase characters to their uppercase version
NTFS : MFT record
MFT record for directory
Application~ File System Interaction

[Figure: process control block (open file pointer array) → open file table (system-wide) → file descriptors (metadata) → file system on disk (file system info, file descriptors, directories, file data)]
open(file…) under the hood
fd = open(FileName, access)

1. Search the directory structure for the given file path.
2. Copy the file descriptor into an in-memory data structure.
3. Create an entry in the system-wide open-file table.
4. Create an entry in the PCB.
5. Return the file pointer to the user.

[Figure: open() allocates and links up data structures across the PCB, the open file table, and the metadata, after a directory lookup by file path in the file system on disk.]
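The five steps can be mimicked with plain dictionaries and lists. This is only a toy model under stated assumptions: the path, the FCB fields, and the names `DISK_DIRECTORY`, `Process`, and `open_file` are all hypothetical, and no real kernel API is being described.

```python
# Hypothetical on-disk state: path -> file descriptor (FCB)
DISK_DIRECTORY = {
    "/home/a.txt": {"inode": 7, "size": 120, "blocks": [20, 35]},
}

system_open_files = []  # system-wide open-file table


class Process:
    """Stand-in for a PCB holding the per-process open file pointer array."""

    def __init__(self):
        self.fd_table = []


def open_file(proc, path, access):
    fcb_on_disk = DISK_DIRECTORY[path]     # 1. directory lookup by file path
    fcb_in_memory = dict(fcb_on_disk)      # 2. copy the descriptor into memory
    entry = {"fcb": fcb_in_memory, "access": access, "offset": 0}
    system_open_files.append(entry)        # 3. system-wide open-file table
    proc.fd_table.append(entry)            # 4. entry in the PCB
    return len(proc.fd_table) - 1          # 5. handle returned to the user


proc = Process()
fd = open_file(proc, "/home/a.txt", "r")
print(fd)                                   # 0
print(proc.fd_table[fd]["fcb"]["size"])     # 120
```

Later calls only need the small integer `fd`: the expensive directory search happens once, at open time.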
read(file…) under the hood
read(fd, userBuf, size)

1. PCB: find the open file descriptor.
2. Open file table: read(fileDesc, userBuf, size)
3. Metadata: translate logical block → physical block, then read(device, phyBlock, size)
4. Buffer cache: get the physical block into sysBuf, then copy it to userBuf.
5. Disk device driver: transfers the block from disk.
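The read path can be traced with a toy model. It assumes 4-byte blocks so the example stays readable, an indexed block list per open file, and a dictionary standing in for the disk. `DISK`, `buffer_cache`, `read_physical`, and `read_file` are illustrative names, not part of any real interface.

```python
BLOCK_SIZE = 4  # tiny blocks, chosen only to keep the trace short

DISK = {20: b"file", 35: b" sys", 11: b"tem!"}  # physical block -> contents
buffer_cache = {}                               # physical block -> cached copy


def read_physical(block_no):
    """Stand-in for the device driver, fronted by the buffer cache."""
    if block_no not in buffer_cache:
        buffer_cache[block_no] = DISK[block_no]  # the only "disk access"
    return buffer_cache[block_no]


def read_file(open_entry, size):
    """Read up to size bytes from the current offset of an open file."""
    user_buf = bytearray()
    pos = open_entry["offset"]
    end = min(pos + size, open_entry["size"])
    while pos < end:
        logical = pos // BLOCK_SIZE               # logical block number
        physical = open_entry["blocks"][logical]  # logical -> physical
        sys_buf = read_physical(physical)         # block lands in sysBuf
        start = pos % BLOCK_SIZE
        take = min(BLOCK_SIZE - start, end - pos)
        user_buf += sys_buf[start:start + take]   # copy to userBuf
        pos += take
    open_entry["offset"] = pos
    return bytes(user_buf)


entry = {"blocks": [20, 35, 11], "size": 12, "offset": 0}
print(read_file(entry, 6))    # b'file s'
print(read_file(entry, 100))  # b'ystem!'
print(sorted(buffer_cache))   # [11, 20, 35]
```

The second read reuses the cached copy of block 35, so only the first touch of each block reaches the disk.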
