
File System: File System Management and Optimization Example File Systems

The document discusses file system management and optimization. It covers managing disk space through techniques like choosing an optimal block size, tracking free blocks, enforcing disk quotas, and performing backups. It also discusses maintaining file system consistency after crashes by checking for issues like missing or duplicate blocks, and validating directory entries match file i-nodes. The goal is to efficiently store information, make it accessible to users, and ensure data integrity.

Uploaded by

Danh Vô
Copyright
© © All Rights Reserved

File System

File System Management and Optimization
Example File Systems
Review
• File System
– Manages stored information and hides the complexity of the
storage devices
– Gives users convenient access to information through a
uniform logical view
– Is stored on disks
– Implementations
• File: Contiguous, Linked List, Linked List using a Table in Memory (FAT),
i-nodes (similar to multi-level paging)
• Directory: Fixed-size entries, one per file, containing file name & attributes
• File Name: fixed size, in-line in the entry, or stored in a heap
• Utilities:
– LFS (writes data to the end of the log and uses a cleaner thread)
– JFS (keeps a log file and uses atomic transactions)
– VFS (uniform logical view across file systems)
Review
• File
– Abstraction, Object and Mechanism
– Name, Path (absolute vs. relative), Structure, Type, Access,
Attributes, Operations
– Directory (Single level vs. Hierarchy)
– Sharing
• Traditional (hard) linking (directory entries share one i-node)
• Symbolic linking (uses a separate i-node whose file holds the path of the
target file)
Objectives…
• File System Management & Optimization
– Disk Space Management
– File System Backups
– File System Consistency
– File System Performance
– Defragmenting Disks
• Example File Systems
– CD-ROM File Systems
– The MS-DOS File Systems
– The UNIX V7 File Systems
File System Management & Optimization

Disk Space Management
• Block size
– Files are stored in fixed-size blocks of bytes, not necessarily
adjacent
– How large should the block be?
• Large
– Is read efficiently, but wastes disk space (a small file leaves most of its block unused)
• Small
– Makes good use of disk space
– Most files span multiple blocks
– Many disk accesses are needed to read a file (reduces performance, wastes time)
– A good compromise should be chosen
• Performance and space utilization are inherently in conflict
– Median file size is a better guide than mean file size (~2KB)
• Files just read (1KB), just written (2.3KB), and both read and written (4.2KB)
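The space side of this trade-off can be put into numbers. The sketch below is only an illustration: the function name and the three block sizes are assumptions, and the 2 KB file size echoes the median quoted above.

```python
import math

def space_utilization(file_size: int, block_size: int) -> float:
    """Fraction of the allocated disk space that holds real file data."""
    blocks = math.ceil(file_size / block_size)  # whole blocks are allocated
    return file_size / (blocks * block_size)

# A 2 KB file stored with small, medium, and large blocks (sizes are illustrative).
for block_size in (1024, 4096, 65536):
    blocks = math.ceil(2048 / block_size)
    print(block_size, blocks, space_utilization(2048, block_size))
```

Small blocks waste nothing here but need two transfers; the largest block needs one transfer and leaves almost the whole block unused.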
File System Management & Optimization
Disk Space Management
• Keep track of Free Blocks
– How to keep track of free blocks?
– The first method
• Using a linked list of disk blocks
• Each block holding as many free disk block numbers as will fit
– The second method
• Using the bitmap
• A disk with n blocks requires a bitmap with n bits
• Free blocks are represented by 1s in the map, allocated blocks by 0s
– The bitmap requires less space than the linked-list model. Only when the disk is
nearly full does the linked list require fewer blocks than the bitmap
– If the free blocks tend to come in long runs of consecutive blocks, the free list
can be modified to keep track of runs of blocks rather than single blocks
– A just-created empty disk can be represented by two numbers: the address of the
first free block and the count of free blocks
– If the disk becomes badly fragmented, keeping track of runs is less efficient than
keeping track of individual blocks
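To compare the two methods, one can count how many disk blocks each needs. This is a rough sketch under stated assumptions: 1 KB blocks, 32-bit block numbers, and one slot per list block reserved for the pointer to the next list block. The function names are hypothetical.

```python
import math

def bitmap_blocks(total_blocks: int, block_size: int) -> int:
    """Blocks needed for a bitmap with one bit per disk block."""
    bits_per_block = block_size * 8
    return math.ceil(total_blocks / bits_per_block)

def free_list_blocks(free_blocks: int, block_size: int, number_size: int = 4) -> int:
    """Blocks needed for a linked free list holding `free_blocks` numbers."""
    # One slot in every list block is used for the pointer to the next list block.
    numbers_per_block = block_size // number_size - 1
    return math.ceil(free_blocks / numbers_per_block)

total = 2**30  # a disk of 2**30 one-KB blocks (1 TB), chosen for illustration
print(bitmap_blocks(total, 1024))      # bitmap size is fixed
print(free_list_blocks(total, 1024))   # list size when every block is free
print(free_list_blocks(1000, 1024))    # list size when the disk is nearly full
```

The bitmap has a fixed size, while the list shrinks as the disk fills, which is why the list only wins on a nearly full disk.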
File System Management & Optimization
Disk Space Management

Tanenbaum, Fig. 4-22.


File System Management & Optimization
Disk Space Management
• Disk Quotas
– To prevent people from hogging too much disk space, multi-user OS often
provide a mechanism for enforcing disk quotas
– System administrator assigns each user a maximum allotment of files and
blocks, and the OS makes sure that the users do not exceed their quotas
– How it works
• When a user opens a file, the attributes and disk addresses are located and put
into an open file table in main memory. Among the attributes is an entry telling
who the owner is. Any increase in the file’s size is charged to the owner’s
quota
• A second table contains the quota record for every user with a currently open
file, even if the file was opened by someone else. It is an extract from a quota file on disk
for users whose files are currently open
• When all of a user’s files are closed, the record is written back to the quota file
• When a new entry is made in the open file table, a pointer to the owner’s quota
record is entered into it, to make it easy to find the various limits
• Every time a block is added to a file, the total number of blocks charged to the
owner is incremented, and a check is made against both hard and soft limits
• When a user attempts to log in, the system examines the quota file to see if the
user has exceeded the soft limit for either the number of files or the number of disk blocks
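The per-block check described above might look like the following sketch. The record layout, field names, and the choice of `OSError` for a hard-limit violation are all assumptions made for illustration; only block limits are modelled, not file-count limits.

```python
from dataclasses import dataclass

@dataclass
class QuotaRecord:
    """Hypothetical in-memory quota record for one user (blocks only)."""
    soft_block_limit: int
    hard_block_limit: int
    blocks_used: int = 0

def charge_block(quota: QuotaRecord) -> bool:
    """Charge one block to the owner.

    Raises OSError if the hard limit would be exceeded; otherwise returns
    True when the soft limit is now exceeded (a warning condition).
    """
    if quota.blocks_used + 1 > quota.hard_block_limit:
        raise OSError("hard block limit exceeded")
    quota.blocks_used += 1
    return quota.blocks_used > quota.soft_block_limit

owner = QuotaRecord(soft_block_limit=2, hard_block_limit=3)
warnings = [charge_block(owner) for _ in range(3)]
print(warnings)
```

The soft limit only produces a warning, while the hard limit makes the allocation fail.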
File System Management & Optimization
Disk Space Management

Tanenbaum, Fig. 4-24.


File System Management & Optimization
Disk Space Management
• File System Backups
– Recover data after a disaster or an accidental
error
– Making a backup takes a long time and occupies a
large amount of space, so doing it efficiently and
conveniently is important
• back up only specific directories
• incremental dumps
– back up only what was modified since the last backup
– can make recovery complicated
• compression
– Physical dump
• Starts at block 0 of the disk, writes all the disk blocks
onto the output tape in order, and stops when it has
copied the last one
• There is no value in backing up unused disk blocks
• Dumping bad blocks is a problem and must be avoided
• Advantages: simplicity, great speed
• Disadvantages: inability to skip selected
directories, make incremental dumps, and restore
individual files upon request
File System Management & Optimization
Disk Space Management
• File System Backups
– Logical dump
• Starts at one or more specified directories and recursively dumps all files and
directories found there that have changed since some given base date
• Also dumps all directories (even unmodified ones) that lie on the path to a modified file or directory
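A logical dump's selection step can be sketched as a recursive walk. This is a simplified in-memory model, not a real dump utility: the `Node` type and its integer `mtime` are hypothetical, and the sketch assumes an unmodified directory is dumped only when it lies on the path to something that changed.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Node:
    """Hypothetical in-memory file or directory."""
    name: str
    mtime: int
    is_dir: bool = False
    children: list[Node] = field(default_factory=list)

def select_for_dump(node: Node, base_date: int, path: str = "") -> list[str]:
    """Return paths to dump: changed files plus the directories leading to them."""
    full = f"{path}/{node.name}" if path else node.name
    if not node.is_dir:
        return [full] if node.mtime > base_date else []
    selected: list[str] = []
    for child in node.children:
        selected += select_for_dump(child, base_date, full)
    # Keep the directory if it changed or if anything below it was selected.
    if selected or node.mtime > base_date:
        return [full] + selected
    return []

root = Node("root", 1, True, [
    Node("a.txt", 5),
    Node("docs", 1, True, [Node("old.txt", 1)]),
    Node("src", 1, True, [Node("main.c", 9)]),
])
print(select_for_dump(root, base_date=3))
```

Keeping the directories on the path lets a changed file be restored even onto a fresh file system that lacks its parent directories.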
File System Management & Optimization
Disk Space Management
• File System Consistency
– Problems
• Many file systems read blocks, modify them, and write them out later
• If the system crashes before all the modified blocks have been written out, the file
system can be left in an inconsistent state
• This is especially critical if some of the blocks that have not been written out are i-node blocks, directory
blocks, or blocks containing the free list
– Solutions
• A utility program checks file system consistency at boot time or after a
crash (e.g., scandisk, or fsck, the file system check)
• There are 2 kinds of consistency checks: blocks and files
• To check blocks
– The program builds two tables, each containing a counter for each block (initially 0)
– The counters in the 1st table keep track of how many times each block is present in a file
– The counters in the 2nd table record how often each block is present in the free list
– The program then reads all the i-nodes using a raw device, which ignores the file structure
and just returns all the disk blocks starting at 0
– As each block number in an i-node is read, its counter in the 1st table is incremented
– The program then examines the free list to find all the blocks that are not in use
– Each occurrence of a block in the free list increments its counter in the 2nd table
File System Management & Optimization
Disk Space Management

Consistency Tanenbaum, Fig. 4-24.

• Missing block: no real harm, but it wastes space and reduces the capacity of the disk
• Solution: the file system checker adds the missing block to the free list
File System Management & Optimization
Disk Space Management

• Duplicate in free list


• Solution: rebuild the free list
File System Management & Optimization
Disk Space Management

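• Data block is present in two or more files → if either file is removed, the block is
put on the free list, so the same block is in use and free at the same time; if both files
are removed, the block is on the free list twice (count 2, as in figure c)
• Solution: allocate a free block, copy the contents of the duplicated block into it, and insert the copy into one of the files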
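The two-table block check and the cases above can be sketched as follows. This is a minimal model, assuming the i-nodes have already been read into a mapping from file name to block list; the function and category names are hypothetical.

```python
from collections import Counter

def check_blocks(total_blocks: int, files: dict[str, list[int]],
                 free_list: list[int]) -> dict[str, list[int]]:
    """Classify inconsistent blocks using one counter table per source."""
    in_use = Counter(b for blocks in files.values() for b in blocks)  # 1st table
    free = Counter(free_list)                                         # 2nd table
    report: dict[str, list[int]] = {
        "missing": [], "duplicate_free": [],
        "duplicate_in_use": [], "in_use_and_free": [],
    }
    for block in range(total_blocks):
        used, freed = in_use[block], free[block]
        if used == 0 and freed == 0:
            report["missing"].append(block)           # fix: add to the free list
        if freed > 1:
            report["duplicate_free"].append(block)    # fix: rebuild the free list
        if used > 1:
            report["duplicate_in_use"].append(block)  # fix: copy to a free block
        if used >= 1 and freed >= 1:
            report["in_use_and_free"].append(block)   # fix: remove from the free list
    return report

files = {"a": [0, 1, 5], "b": [5, 6]}
free_list = [2, 3, 3, 6]
print(check_blocks(8, files, free_list))
```

A consistent file system yields four empty lists: every block has exactly one 1 across the two tables.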
File System Management & Optimization
Disk Space Management
• Directory Consistency
– Checks that directories and the files they reference are consistent
• It uses a table of counters, one per file
– The link counts stored in the i-nodes start at 1 when a file is created and are incremented each time a
link is made to the file
• Traverse the directory tree
– For every i-node in every directory, the checker increments the counter for that file
• When the checker is all done, it has a list, indexed by i-node number, telling
how many directories contain each file
• It then compares these numbers with the link counts stored in the i-nodes
themselves
– Both counts agree → consistent
– The link count is too high: even when all the files are removed from the directories, the count
stays nonzero and the i-node is not removed (not a serious error, but it wastes space) → fixed by setting the link count
in the i-node to the correct value
– The link count is too low: e.g., 2 directory entries are linked to a file, but the i-node
says there is only one (potentially catastrophic: when either entry is removed, the count drops to zero and all blocks are released,
so the other directory entry points to an unused i-node) → force the link count in the
i-node to the actual number of directory entries
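The link-count comparison can be sketched in a few lines. The data layout here is an assumption for illustration: directories are given as lists of (name, i-node number) pairs, and the stored link counts as a plain mapping.

```python
from collections import Counter

def check_link_counts(directories: dict[str, list[tuple[str, int]]],
                      stored_counts: dict[int, int]) -> dict[int, int]:
    """Return {i-node number: corrected link count} for every mismatch."""
    # Count how many directory entries refer to each i-node.
    found = Counter(ino for entries in directories.values() for _, ino in entries)
    fixes = {}
    for ino, stored in stored_counts.items():
        actual = found[ino]
        if stored != actual:
            fixes[ino] = actual  # force the count to the number of entries found
    return fixes

directories = {
    "/": [("bin", 4), ("notes", 7)],
    "/home": [("notes-link", 7), ("todo", 9)],
}
stored_counts = {4: 1, 7: 1, 9: 3}  # i-node 7 is too low, i-node 9 is too high
print(check_link_counts(directories, stored_counts))
```

Both kinds of mismatch get the same repair, setting the stored count to the number of entries actually found, although only the too-low case risks data loss.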
File System Management & Optimization
Disk Space Management
• File System Performance
– Access to disk is much slower than access to memory because the disk
must seek to the track and then wait for the desired sector to arrive
under the read head
– Many file systems have been designed with various optimizations
to improve performance
– Cache (Block cache or buffer cache)
• Is a collection of blocks that logically belong on the disk but are being
kept in memory
• Check every read request to see if the needed block is in the cache. If it is,
the request is satisfied without a disk access. Otherwise, the block is read into the cache and then
copied to wherever it is needed
• For quick lookup, a hash table and a linked list are used
• When a block has to be loaded into a full cache, the usual page replacement
algorithms apply (exact LRU is feasible here)
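A hash table combined with an ordered list is what Python's `OrderedDict` provides, so a read-only LRU block cache can be sketched as below. The class name and the `read_block` callable are hypothetical, and writing back modified blocks is deliberately left out.

```python
from collections import OrderedDict
from typing import Callable

class BlockCache:
    """Minimal LRU block cache (reads only, no dirty-block handling)."""

    def __init__(self, capacity: int, read_block: Callable[[int], bytes]):
        self.capacity = capacity
        self.read_block = read_block   # hypothetical disk-read function
        self.blocks: OrderedDict[int, bytes] = OrderedDict()
        self.disk_reads = 0

    def get(self, block_no: int) -> bytes:
        if block_no in self.blocks:            # hash lookup: cache hit
            self.blocks.move_to_end(block_no)  # mark as most recently used
            return self.blocks[block_no]
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)    # evict the least recently used block
        data = self.read_block(block_no)       # cache miss: go to the disk
        self.disk_reads += 1
        self.blocks[block_no] = data
        return data

cache = BlockCache(2, lambda n: f"block {n}".encode())
for n in (1, 2, 1, 3, 1):
    cache.get(n)
print(cache.disk_reads, list(cache.blocks))
```

In the run above, block 2 is evicted when block 3 arrives because block 1 was touched more recently.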
File System Management & Optimization
Disk Space Management
• File System Performance
– Cache (Block cache or buffer cache) (cont)
• Problems:
– Is the block likely to be needed again soon?
– Is the block essential to the consistency of the file system?
• Solutions: blocks can be divided into categories such as i-node blocks, indirect
blocks, directory blocks, full data blocks, and partially full data blocks
– Blocks that will probably not be needed again soon go on the front, rather than the rear
of the LRU list, so their buffers will be reused quickly. Blocks that might be needed
again soon, such as a partly full block that is being written, go to the end of the list, so
they will stay around for a long time
– If the block is essential to file system consistency and it has been modified, it
should be written to disk immediately (limits the damage a crash can do)

Tanenbaum, Fig. 4-28.


File System Management & Optimization
Disk Space Management
• File System Performance
– Block Read Ahead
• Try to get blocks into the cache before they are needed to increase the hit
rate
• Many files are read sequentially. When the file system is asked to produce
block k in a file, it does that, but when it is finished, it makes a sneaky
check in the cache to see if block k + 1 is already there. If it is not, it
schedules a read for block k + 1 in the hope that when it is needed, it will
have already arrived in the cache
• The file system can keep track of the access pattern of each open file, using a bit per file.
Initially, the file is assumed to be in sequential access mode and the bit is set. Whenever a seek is
done, the bit is cleared. If sequential reads start happening again, the bit is
set once again
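The read-ahead policy can be sketched as below. This is a toy model under several assumptions: the cache is a plain set of block numbers, adding k + 1 to the set stands in for scheduling an asynchronous read, and any read that does not follow the previous block is treated as a seek.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class OpenFile:
    sequential: bool = True            # the per-file access-mode bit
    last_block: Optional[int] = None   # last block number read

def read(f: OpenFile, k: int, cache: set[int]) -> Optional[int]:
    """Read block k; return the block number prefetched, or None."""
    if f.last_block is not None:
        # Cleared by a seek, set again when reads become sequential.
        f.sequential = (k == f.last_block + 1)
    f.last_block = k
    cache.add(k)
    if f.sequential and (k + 1) not in cache:
        cache.add(k + 1)  # stands in for scheduling a read of block k + 1
        return k + 1
    return None

f, cache = OpenFile(), set()
print([read(f, k, cache) for k in (0, 1, 7, 8)])
```

The jump to block 7 suppresses the prefetch, and read-ahead resumes once block 8 follows it.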
File System Management & Optimization
Disk Space Management
• File System Performance
– Reducing Disk Arm Motion
• Put blocks that are likely to be accessed in sequence close to each other,
preferably in the same cylinder
• Keep track of disk storage not in blocks, but in groups of consecutive
blocks
• When allocating blocks, the system attempts to place consecutive blocks
of a file in the same cylinder (rotational positioning)
• Put the i-nodes in the middle of the disk, rather than at the start, thus
reducing the average seek between the i-node and the first block by a
factor of two
• Divide the disk into cylinder groups, each with its own i-nodes, blocks,
and free list. When creating a new file, any i-node can be chosen, but an
attempt is made to find a block in the same cylinder group as the i-node.
If none is available, then a block in a nearby cylinder group is used
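The cylinder-group policy can be sketched as a search outward from the i-node's group. This is only an illustration: the per-group free lists are plain Python lists, and "nearby" is assumed to mean the smallest difference in group number.

```python
def allocate_block(inode_group: int, free_by_group: list[list[int]]) -> tuple[int, int]:
    """Pick a free block, preferring the i-node's own cylinder group.

    Returns (group, block number); raises OSError when no block is free.
    """
    # Try groups in order of distance from the i-node's group.
    order = sorted(range(len(free_by_group)), key=lambda g: abs(g - inode_group))
    for group in order:
        if free_by_group[group]:
            return group, free_by_group[group].pop(0)
    raise OSError("no free blocks")

free_by_group = [[], [10, 11], [], [30]]
print(allocate_block(1, free_by_group))  # same group as the i-node
print(allocate_block(2, free_by_group))  # group 2 is empty, so a neighbour is used
```

Keeping a file's blocks in or near the i-node's group keeps seeks short.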
File System Management & Optimization
Disk Space Management

Tanenbaum, Fig. 4-29.


File System Management & Optimization
Disk Space Management
• File System Performance
– Defragmenting Disks
• Disks become badly fragmented over time, with files and holes all over the place →
poor performance, because the blocks of a new file end up scattered across the disk
→ defragmenting moves files around to make them contiguous and to put all of the free
space in one or more large contiguous regions on the disk
Example File Systems
CD-ROM File Systems
• Simple because they were designed for write-once media
• They have no provision for keeping track of free blocks
• Do not have concentric cylinders; instead there is a single continuous spiral
containing the bits in a linear sequence. The bits along the spiral are
divided into logical blocks
• CD-R
– It is possible to add files after the initial burning, but these are simply appended to
the end of the CD-R
– All the free space is in one contiguous chunk at the end of the CD
– Files are never removed (although the directory can be updated to hide existing
files)
• ISO 9660
– Adopted as an International Standard in 1988
– Makes every CD-ROM readable on every computer, independent of the byte
ordering used and independent of the OS used
Example File Systems
CD-ROM File Systems
• Rock Ridge Extensions
– People in the UNIX community began working on an extension to make it
possible to represent UNIX file systems on a CD-ROM
– Uses the System use field, so that Rock Ridge CD-ROMs remain
readable on any computer
– Any system not aware of the Rock Ridge extensions just ignores
them and sees a normal CD-ROM
– Rock Ridge extension fields:
• PX - POSIX attributes.
• PN - Major and minor device numbers.
• SL - Symbolic link.
• NM - Alternative name.
• CL - Child location.
• PL - Parent location.
• RE - Relocation.
• TF - Time stamps.
Example File Systems
CD-ROM File Systems
• Joliet Extensions
– Invented by Microsoft
– Designed to allow Windows file systems to be copied to CD-ROM
and then restored, in precisely the same way that Rock
Ridge was designed for UNIX
– Major extensions provided by Joliet
• Long file names.
• Unicode character set.
• Directory nesting deeper than eight levels.
• Directory names with extensions
Example File Systems
MS-DOS File System
• To read a file, MS-DOS must first make an open system call
that specifies a path. The path is looked up component by
component until the final directory is located and read into
memory. It is then searched for the file to be opened
• Uses a fixed-size 32-byte directory entry
• File names (8 + 3) characters
• Stores the file size as a 32-bit number (4GB in theory; other limits restrict files to 2GB or less)
• Keeps track of file blocks via a FAT in main memory
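Following a file through the FAT amounts to walking a linked list held in a table. The sketch below uses a simplified model: the FAT is a Python list indexed by block number, and the end-of-file marker `EOF = -1` is a placeholder rather than the real on-disk encoding.

```python
EOF = -1  # placeholder end-of-chain marker, not the real on-disk value

def file_blocks(fat: list[int], first_block: int) -> list[int]:
    """Return a file's block numbers in order by following the FAT chain."""
    blocks = []
    block = first_block
    while block != EOF:
        if block in blocks:
            raise ValueError("cycle in FAT chain")  # guard against corruption
        blocks.append(block)
        block = fat[block]  # each entry holds the number of the next block
    return blocks

fat = [EOF] * 16
fat[4], fat[7], fat[2], fat[10], fat[12] = 7, 2, 10, 12, EOF
print(file_blocks(fat, 4))
```

The directory entry only has to store the first block number; the rest of the chain lives in the table in memory, so random access needs no disk reads to find a block.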

Tanenbaum, Fig. 4-31.


Example File Systems
MS-DOS File System

Tanenbaum, Fig. 4-32.


Example File Systems
UNIX V7 File System
• Form of a tree starting at the root directory, with the
addition of links, forming a directed acyclic graph
• File names are up to 14 characters and can contain any
ASCII character except / and NUL
• A UNIX directory contains one entry for each file in
that directory, holding the file name (14 bytes) and the
i-node number (2 bytes)
• Each entry is extremely simple because the attributes
are kept in the i-node
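The 16-byte entry can be modelled with Python's `struct` module. The field order (i-node number first) and the little-endian byte order are assumptions made for this sketch, and the helper names are hypothetical.

```python
import struct

# Assumed layout: 2-byte i-node number, then a 14-byte NUL-padded name.
ENTRY = struct.Struct("<H14s")

def pack_entry(inode: int, name: str) -> bytes:
    """Build one 16-byte directory entry."""
    raw = name.encode("ascii")
    if len(raw) > 14 or b"/" in raw or b"\0" in raw:
        raise ValueError("invalid file name")
    return ENTRY.pack(inode, raw)  # struct pads short names with NUL bytes

def unpack_entry(data: bytes) -> tuple[int, str]:
    """Split a 16-byte entry into (i-node number, file name)."""
    inode, raw = ENTRY.unpack(data)
    return inode, raw.rstrip(b"\0").decode("ascii")

entry = pack_entry(26, "usr")
print(len(entry), unpack_entry(entry))
```

Because each entry holds only a name and an i-node number, looking up a name is a linear scan over fixed-size records, and everything else about the file is read from the i-node.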

Tanenbaum, Fig. 4-33.


Example File Systems
UNIX V7 File System

Tanenbaum, Fig. 4-34.


Summary
• Operating System Concepts
• Files
• Directories
• File System Implementation
• File System Management and Optimization
• Example File Systems

Q&A
Next Lecture
• I/O Hardware
• I/O Software
• Principles
• Layers
• Disk management
• Disk Arm Scheduling Algorithms
• Error Handling
• Stable Storage
