
Lecture 09 10

The document provides an overview of file systems, including their functions, design trade-offs, and implementation details. It covers various aspects such as file attributes, operations, directory structures, and allocation methods, along with protection mechanisms for file access. Additionally, it discusses remote file systems and performance considerations related to file management.

Uploaded by

Amanda James

LECTURES 9 & 10: FILE SYSTEMS & FILE SYSTEM IMPLEMENTATION
OVERVIEW
File Systems
 Access Methods
 Disk and Directory Structure
 File-System Mounting
 File Sharing
 Protection
File Systems Implementation
 File-System Structure
 File-System Implementation
 Directory Implementation
 Allocation Methods
 Free-Space Management
 Efficiency and Performance
2
OBJECTIVES
To explain the function of file systems
To describe the interfaces to file systems
To discuss file-system design tradeoffs, including access
methods, file sharing, file locking, and directory structures
To explore file-system protection
To describe the details of implementing local file systems
and directory structures
To describe the implementation of remote file systems
To discuss block allocation and free-block algorithms and
trade-offs
3
FILE CONCEPT
Contiguous logical address space
Types:
 Data
 numeric
 character
 binary
 Program
Contents defined by file’s creator (user)
 Many types
 Consider text file, source file, executable file

4
FILE ATTRIBUTES
Name – only information kept in human-readable form
Identifier – unique tag (number) identifies file within file system
Type – needed for systems that support different types
Location – pointer to file location on device (path)
Size – current file size
Protection – controls who can do reading, writing, executing
Time, date, and user identification – data for protection, security, and usage monitoring

Information about files is kept in the directory structure, which is maintained on the disk
Many variations, including extended file attributes such as file checksum
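
On a POSIX-style system most of these attributes can be read back through the stat interface. The following is a minimal Python sketch; the helper name `describe` and the throwaway file are illustrative assumptions, not part of the lecture.

```python
import os
import stat
import tempfile
import time

def describe(path):
    """Return a few of the attributes the file system keeps for path."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),              # human-readable name
        "identifier": info.st_ino,                   # unique tag within the file system
        "size": info.st_size,                        # current size in bytes
        "protection": stat.filemode(info.st_mode),   # type and permission string
        "owner_uid": info.st_uid,                    # user identification
        "modified": time.ctime(info.st_mtime),       # time and date of last change
    }

# Create a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    sample_path = f.name

attributes = describe(sample_path)
print(attributes)
os.remove(sample_path)
```

The name is the only attribute a user normally types; the rest is metadata that the operating system looks up from the directory entry.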

5
FILE STRUCTURE
None - sequence of words, bytes
Simple record structure
 Lines
 Fixed length
 Variable length
Complex Structures
 Formatted document
 Relocatable load file
Can simulate last two with first method by inserting appropriate control
characters
Who decides the structure:
 Operating system
 Program
6
FILE OPERATIONS
File is an abstract data type
Create
Write – at write pointer location
Read – at read pointer location
Reposition within file - seek
Delete – remove a file
Truncate – erase the contents of a file (or cut it to a shorter
length) while keeping its attributes
Open(Fi) – search the directory structure on disk for
entry Fi, and move the content of entry to memory
Close (Fi) – move the content of entry Fi in memory
to directory structure on disk
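
The operations above map closely onto the file API of most languages. A small Python sketch, assuming a temporary directory and a hypothetical file name `demo.txt`:

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo.txt")

# Create + Open + Write: "wb" creates the file and positions the write pointer at 0.
with open(path, "wb") as f:
    f.write(b"0123456789")
# Leaving the with-block closes the file (Close).

# Open + Reposition (seek) + Read at the current pointer.
with open(path, "rb") as f:
    f.seek(2)
    middle = f.read(3)

# Truncate: keep only the first 4 bytes; the name and other attributes stay.
with open(path, "r+b") as f:
    f.truncate(4)

with open(path, "rb") as f:
    remaining = f.read()

# Delete: remove the directory entry and release the space.
os.remove(path)
os.rmdir(workdir)
deleted = not os.path.exists(path)

print(middle, remaining, deleted)
```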
7
OPEN FILES
Several pieces of data are needed to manage open files:
 Open-file table – tracks open files
 File pointer – pointer to last read/write location, per process that has the file open
 File-open count – counts the number of times a file is open, to allow removal of data from the open-file table when the last process closes it
 Disk location of the file – cache of data access information
 Access rights – per-process access mode information
8
OPEN FILE LOCKING
Provided by some operating systems and file systems
 Similar to reader-writer locks
 Shared lock like reader lock – several processes can acquire concurrently
 Exclusive lock like writer lock

Mediates access to a file

Mandatory or advisory:
 Mandatory – access is denied depending on locks held and requested
 Advisory – processes can find status of locks and decide what to do
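
Advisory locking can be tried out with the Unix `flock` call, which Python exposes as `fcntl.flock`. This sketch is Unix-only and assumes a temporary lock file; it shows an exclusive lock excluding a second holder, a shared lock being held twice, and a write that ignores the lock still succeeding.

```python
import fcntl
import os
import tempfile

fd, lock_path = tempfile.mkstemp()
os.close(fd)

# Two independent opens of the same file stand in for two processes.
first = open(lock_path, "w")
second = open(lock_path, "w")

# Exclusive (writer-like) lock: only one holder at a time.
fcntl.flock(first, fcntl.LOCK_EX)
try:
    fcntl.flock(second, fcntl.LOCK_EX | fcntl.LOCK_NB)  # non-blocking attempt
    second_got_lock = True
except BlockingIOError:
    second_got_lock = False

# Advisory: a writer that never asks for the lock is not stopped.
second.write("written without holding the lock")
second.flush()

# Shared (reader-like) locks: several holders at once.
fcntl.flock(first, fcntl.LOCK_UN)
fcntl.flock(first, fcntl.LOCK_SH)
fcntl.flock(second, fcntl.LOCK_SH | fcntl.LOCK_NB)
shared_by_both = True

first.close()
second.close()
os.remove(lock_path)
print(second_got_lock, shared_by_both)
```

With mandatory locking the unlocked write in the middle would be refused by the operating system instead.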

9
FILE TYPES – NAME, EXTENSION

10
SEQUENTIAL-ACCESS FILE

11
ACCESS METHODS

12
SIMULATION OF SEQUENTIAL
ACCESS ON DIRECT-ACCESS FILE
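
The usual way to simulate sequential access on a direct-access file is to keep a current-position variable and advance it after every read or write. A hedged Python sketch; the class names `DirectAccessFile` and `SequentialView` and the variable `cp` are illustrative choices, not taken from the slide.

```python
class DirectAccessFile:
    """A file of fixed-length records addressed by relative record number."""

    def __init__(self, records):
        self._records = list(records)

    def read(self, n):
        return self._records[n]

    def write(self, n, value):
        if n == len(self._records):      # writing just past the end extends the file
            self._records.append(value)
        else:
            self._records[n] = value


class SequentialView:
    """Sequential operations built on top of direct access."""

    def __init__(self, direct_file):
        self._file = direct_file
        self.cp = 0                      # current position

    def reset(self):
        self.cp = 0

    def read_next(self):
        value = self._file.read(self.cp)
        self.cp += 1
        return value

    def write_next(self, value):
        self._file.write(self.cp, value)
        self.cp += 1


view = SequentialView(DirectAccessFile(["a", "b", "c"]))
first_two = [view.read_next(), view.read_next()]
view.write_next("X")                     # overwrites record 2
view.reset()
after_rewrite = [view.read_next() for _ in range(3)]
print(first_two, after_rewrite)
```

The reverse direction, simulating direct access on a sequential file, is far less efficient because every jump means re-reading from the start.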

13
OTHER ACCESS METHODS
Can be built on top of base methods
Generally involve creation of an index for the file
Keep index in memory for fast determination of location of data to be
operated on (consider UPC code plus record of data about that item)
If the index is too large, keep an index (in memory) of the index (on disk)
Example: IBM Indexed Sequential-Access Method (ISAM)
 Small master index, points to disk blocks of secondary index
 File kept sorted on a defined key
 All done by the OS
VMS operating system provides index and relative files as another
example (see next slide)
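
A two-level lookup in the spirit of the master and secondary index can be sketched as follows. The keys, block numbers, and the function name `find_record_block` are made up for illustration; a real ISAM implementation is considerably more involved.

```python
import bisect

# Secondary index blocks, imagined to live on disk.
# Each entry is (key, number of the disk block holding that record).
secondary_blocks = [
    [(100, 7), (140, 3), (180, 9)],
    [(220, 1), (260, 5), (300, 8)],
    [(340, 2), (380, 6), (420, 4)],
]

# Small master index kept in memory: the first key of each secondary block.
master_index = [block[0][0] for block in secondary_blocks]

def find_record_block(key):
    """Return the disk block holding key, or None if the key is absent."""
    # Step 1: binary search of the in-memory master index.
    slot = bisect.bisect_right(master_index, key) - 1
    if slot < 0:
        return None
    # Step 2: read one secondary index block (one disk read in a real system).
    block = secondary_blocks[slot]
    keys = [k for k, _ in block]
    pos = bisect.bisect_left(keys, key)
    if pos < len(keys) and keys[pos] == key:
        return block[pos][1]
    return None

print(find_record_block(260), find_record_block(50))
```

Because the file is kept sorted on the key, both steps can use binary search, so a record is located with at most two direct-access reads beyond the in-memory lookup.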

14
EXAMPLE OF INDEX AND RELATIVE FILES

Logical record number = memory location
15


DIRECTORY STRUCTURE
A collection of nodes containing information about all files

[Figure: a directory whose entries point to the files F1, F2, F3, F4, …, Fn]

Both the directory structure and the files reside on disk

16
DISK STRUCTURE
Disk can be subdivided into partitions
Disks or partitions can be protected against failure with RAID (any level other than RAID 0, which provides no redundancy)
Disk or partition can be used raw – without a file system, or formatted
with a file system
Partitions also known as minidisks, slices
Entity containing file system known as a volume
Each volume containing file system also tracks that file system’s info in
device directory or volume table of contents
As well as general-purpose file systems there are many special-purpose
file systems, frequently all within the same operating system or computer

17
A TYPICAL FILE-SYSTEM
ORGANIZATION

18
OPERATIONS
PERFORMED ON
DIRECTORY

19
ORGANIZATION OF THE
DIRECTORY
Efficiency – locating a file quickly (Must be effective)
Naming – convenient to users
 Two users can have same name for different files
 The same file can have several different names
Grouping – logical grouping of files by properties, (e.g., all Java
programs, all games, …)
Single directory for all users
 Naming problem and grouping problem
Separate directory for each user
 Different users can have the same file name
 Efficient searching
 No grouping capability

20
Tree-Structured Directories
Advantages
•Commonly used
•Efficient searching
•Grouping capability

Each user has a current directory (working directory)
cd /spell/mail/prog
type list

25
TREE-STRUCTURED
DIRECTORIES

27
FILE SYSTEM MOUNTING

(a) A file system must be mounted before it can be accessed
(b) An unmounted file system is mounted at a mount point
29
FILE SHARING
Sharing of files on multi-user systems is desirable
Sharing may be done through a protection scheme
On distributed systems, files may be shared across a network
Network File System (NFS) is a common distributed file-sharing method
 Remote file systems add new failure modes, due to network failure and server
failure
 Recovery from failure can involve state information about status of each remote
request
 Stateless protocols such as NFS v3 include all information in each request, allowing
easy recovery but less security

On a multi-user system
 User IDs identify users, allowing permissions and protections to be per-user
 Group IDs allow users to be in groups, permitting group access rights
 Owner of a file / directory
 Group of a file / directory
31
FILE SHARING – REMOTE FILE SYSTEMS
Uses networking to allow file system access between systems
 Manually via programs like FTP
 Automatically, seamlessly using distributed file systems (DFS)
 Semi-automatically via the World Wide Web (WWW)

Client-server model allows clients to mount remote file systems from servers
 Server can serve multiple clients
 Client and user-on-client identification is insecure or complicated
 NFS is standard UNIX client-server file sharing protocol
 CIFS is standard Windows protocol
 Standard operating system file calls are translated into remote calls

Distributed Information Systems (distributed naming services) such as LDAP, DNS, NIS implement unified access to information needed for remote computing.
32
PROTECTION
File owner/creator should be able to control:
 what can be done
 By whom

Types of access
 Read
 Write
 Execute
 Append
 Delete
 List
33
ACCESS LISTS AND GROUPS
Mode of access: read, write, execute
Three classes of users on Unix / Linux
                     RWX
a) owner access  7 -> 111
b) group access  6 -> 110
c) public access 1 -> 001

Ask manager to create a group (unique name), say G, and add some users to the group.
For a particular file (say game) or subdirectory, define an appropriate access.

      owner group public
chmod   7     6     1    game

Change the group ownership of a file or directory:
chgrp G game
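
To see how an octal mode such as 761 expands into the three RWX triplets, the bits can be decoded directly. This Python sketch uses a hypothetical helper `rwx_bits` and applies the same mode to a temporary file standing in for `game`; it assumes a Unix-like system where `os.chmod` honors all nine permission bits.

```python
import os
import stat
import tempfile

def rwx_bits(mode):
    """Render a three-digit octal mode as [owner, group, public] triplets."""
    triplets = []
    for shift in (6, 3, 0):                  # owner, group, public
        digit = (mode >> shift) & 0b111
        triplets.append(
            ("r" if digit & 4 else "-")
            + ("w" if digit & 2 else "-")
            + ("x" if digit & 1 else "-")
        )
    return triplets

decoded = rwx_bits(0o761)
print(decoded)

# Equivalent of "chmod 761 game" on a throwaway file.
fd, game = tempfile.mkstemp()
os.close(fd)
os.chmod(game, 0o761)
applied_mode = stat.S_IMODE(os.stat(game).st_mode)
os.remove(game)
print(oct(applied_mode))
```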

34
A SAMPLE UNIX
DIRECTORY LISTING

35
DIRECTORY IMPLEMENTATION
Linear list of file names with pointer to the data blocks
 Simple to program
 Time-consuming to execute
 Linear search time
 Could keep ordered alphabetically via linked list or use B+ tree
Hash Table – linear list with hash data structure
 Decreases directory search time
 Collisions – situations where two file names hash to the same location
 Only good if entries are fixed size, or use chained-overflow method
36
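
A hashed directory with chained overflow can be sketched in a few lines. The class `HashedDirectory` and its deliberately weak hash (sum of the name's bytes) are assumptions chosen so that a collision is easy to produce; a real file system would use a stronger hash function.

```python
class HashedDirectory:
    """Directory as a hash table; colliding names share a chain."""

    def __init__(self, buckets=8):
        self._buckets = [[] for _ in range(buckets)]

    def _slot(self, name):
        # Deterministic toy hash: sum of the bytes of the name.
        return sum(name.encode("utf-8")) % len(self._buckets)

    def add(self, name, first_block):
        chain = self._buckets[self._slot(name)]
        for i, (existing, _) in enumerate(chain):
            if existing == name:             # replace an existing entry
                chain[i] = (name, first_block)
                return
        chain.append((name, first_block))    # chained overflow on collision

    def lookup(self, name):
        for existing, first_block in self._buckets[self._slot(name)]:
            if existing == name:
                return first_block
        return None


directory = HashedDirectory()
directory.add("ab", 10)
directory.add("ba", 20)                      # same byte sum, so same slot
collided = directory._slot("ab") == directory._slot("ba")
print(collided, directory.lookup("ab"), directory.lookup("ba"))
```

A lookup only scans one chain instead of the whole list, which is where the shorter search time comes from.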
ALLOCATION METHODS
- CONTIGUOUS
An allocation method refers to how disk blocks are
allocated for files:
Contiguous allocation – each file occupies set of
contiguous blocks
Best performance in most cases
Simple – only starting location (block #) and length
(number of blocks) are required (based on frame
concept)
Problems include finding space for file, knowing file
size, external fragmentation, need for compaction
off-line (downtime) or on-line
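
With contiguous allocation, translating a logical byte address into a disk location needs only the starting block and the length. A sketch assuming 512-byte blocks; the function name `locate` and the sample numbers are illustrative.

```python
BLOCK_SIZE = 512  # bytes per block (assumed for this example)

def locate(start_block, length_blocks, logical_address):
    """Map a logical byte address to (physical block, offset within block)."""
    q, r = divmod(logical_address, BLOCK_SIZE)
    if not 0 <= q < length_blocks:
        raise ValueError("address is outside the file")
    return start_block + q, r

# A file that starts at block 14 and is 3 blocks long.
location = locate(14, 3, 1300)
print(location)
```

No table lookup or pointer chasing is involved, which is why contiguous allocation supports both sequential and direct access well.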
37
CONTIGUOUS ALLOCATION
OF DISK SPACE

38
ALLOCATION METHODS -
LINKED
Linked allocation – each file is a linked list of blocks
 File ends at nil pointer
 No external fragmentation
 Each block contains pointer to next block
 No compaction
 Free space management system called when new block needed
 Improve efficiency by clustering blocks into groups but increases internal
fragmentation
 Reliability can be a problem
 Locating a block can take many I/Os and disk seeks

FAT (File Allocation Table) variation


 Beginning of volume has table, indexed by block number
 Much like a linked list, but faster on disk and cacheable
 New block allocation simple
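
The FAT idea can be sketched as a list indexed by block number in which each entry names the next block of the file. The table size, the markers `FREE` and `END_OF_FILE`, and the block numbers below are illustrative assumptions, not the on-disk FAT format.

```python
FREE = 0            # entry value for an unused block (block 0 is reserved here)
END_OF_FILE = -1    # entry value for the last block of a file

fat = [FREE] * 12
fat[4] = 9          # a file that starts at block 4 ...
fat[9] = 2          # ... continues at block 9, then block 2
fat[2] = END_OF_FILE

def file_blocks(table, start_block):
    """Follow the chain in the table and list the file's blocks in order."""
    blocks = []
    block = start_block
    while block != END_OF_FILE:
        blocks.append(block)
        block = table[block]
    return blocks

def append_block(table, start_block):
    """Allocate the first free block and link it to the end of the file."""
    new_block = next(i for i in range(1, len(table)) if table[i] == FREE)
    last_block = file_blocks(table, start_block)[-1]
    table[last_block] = new_block
    table[new_block] = END_OF_FILE
    return new_block

before = file_blocks(fat, 4)
added = append_block(fat, 4)
after = file_blocks(fat, 4)
print(before, added, after)
```

Because the whole chain lives in the table, it can be followed in memory without reading each data block, which is the speed advantage over plain linked allocation.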
39
LINKED ALLOCATION
 Each file is a linked list of disk blocks: blocks may be scattered anywhere on the disk

40
FILE-ALLOCATION TABLE

41
ALLOCATION METHODS - INDEXED
Indexed allocation
 Each file has its own index block(s) of pointers to its data blocks

[Figure: logical view of the index table]

42
EXAMPLE OF INDEXED
ALLOCATION

43
COMBINED SCHEME: UNIX UFS
(4K BYTES PER BLOCK, 32-BIT ADDRESSES)

Note: More index blocks than can be addressed with 32-bit file pointer
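
The note can be checked with a little arithmetic. The sketch below assumes the classic inode layout of 12 direct pointers plus one single, one double, and one triple indirect block; the figure of 12 is an assumption, since the slide text does not state it.

```python
BLOCK_SIZE = 4 * 1024        # 4K bytes per block (from the slide title)
POINTER_SIZE = 4             # 32-bit block addresses (from the slide title)
DIRECT_POINTERS = 12         # assumed number of direct pointers in the inode

pointers_per_block = BLOCK_SIZE // POINTER_SIZE

# Blocks reachable through direct, single, double and triple indirect pointers.
addressable_blocks = (
    DIRECT_POINTERS
    + pointers_per_block
    + pointers_per_block ** 2
    + pointers_per_block ** 3
)

max_bytes_by_index = addressable_blocks * BLOCK_SIZE
max_bytes_by_pointer = 2 ** 32   # largest offset a 32-bit file pointer can express

print(pointers_per_block, max_bytes_by_index, max_bytes_by_pointer)
```

Under these assumptions the index structure could describe a file of roughly 4 TB, while a 32-bit file pointer stops at 4 GB, so the pointer width is the binding limit.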

44
PERFORMANCE
Best method depends on file access type
 Contiguous great for sequential and random
 Linked good for sequential, not random

Declare access type at creation -> select either contiguous or linked

Indexed more complex
 Single block access could require 2 index block reads then data block read
 Clustering can help improve throughput, reduce CPU overhead
Adding instructions to the execution path to save one disk I/O is reasonable
 Intel Core i7 Extreme Edition 990x (2011) at 3.46 GHz = 159,000 MIPS
 http://en.wikipedia.org/wiki/Instructions_per_second
 Typical disk drive at 250 I/Os per second
 159,000 MIPS / 250 = about 636 million instructions during one disk I/O
 Fast SSD drives provide 60,000 IOPS
 159,000 MIPS / 60,000 = 2.65 million instructions during one disk I/O
45
FREE-SPACE MANAGEMENT
File system maintains free-space list to track available blocks/clusters
 (Using term “block” for simplicity)

Bit vector or bit map (n blocks), one bit per block, numbered 0, 1, 2, …, n-1

bit[i] = 1 -> block[i] free
bit[i] = 0 -> block[i] occupied

Block number calculation:
(number of bits per word) * (number of 0-value words) + offset of first 1 bit
CPUs have instructions to return offset within word of first “1” bit
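
The block number calculation can be written out directly. This sketch assumes 32-bit words and counts the offset from the most significant bit, so that bit 0 of the map is the leftmost bit of word 0; the function name `first_free_block` is illustrative.

```python
BITS_PER_WORD = 32  # assumed word size

def first_free_block(words):
    """Return the number of the first free block (bit = 1), or None if full."""
    for zero_words, word in enumerate(words):
        if word != 0:
            # Offset of the first 1 bit, counted from the most significant bit.
            offset = BITS_PER_WORD - word.bit_length()
            return BITS_PER_WORD * zero_words + offset
    return None

# Two fully occupied words, then a word whose first 1 bit is at offset 8.
bitmap = [0x00000000, 0x00000000, 0x00800000]
free_block = first_free_block(bitmap)
print(free_block)
```

Skipping whole zero words is what makes the scan fast; only the first non-zero word needs a bit-level search, which hardware can do in one instruction.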

46
FREE-SPACE MANAGEMENT (CONT.)
Bit map requires extra space
 Example:
block size = 4KB = 2^12 bytes
disk size = 2^40 bytes (1 terabyte)
n = 2^40 / 2^12 = 2^28 bits (or 32 MB)
if clusters of 4 blocks -> 32/4 = 8 MB of memory
Easy to get contiguous files
Linked list (free list)
 Cannot get contiguous space easily
 No waste of space
 No need to traverse the entire list (if # free blocks recorded)

47
FREE-SPACE MANAGEMENT (CONT.)
Grouping
 Modify linked list to store address of next n-1 free blocks in first free
block, plus a pointer to next block that contains free-block-pointers
(like this one)
Counting
 Because space is frequently contiguously used and freed, with
contiguous allocation, extents, or clustering
 Keep address of first free block and count of following free blocks
 Free space list then has entries containing addresses and counts
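
The counting approach can be illustrated by collapsing a list of free block numbers into (first free block, count of following free blocks) entries. The function name `to_counted_entries` and the block numbers are made up for this sketch.

```python
def to_counted_entries(free_blocks):
    """Collapse free block numbers into (first_block, following_count) entries."""
    entries = []
    for block in sorted(set(free_blocks)):
        if entries and block == entries[-1][0] + entries[-1][1] + 1:
            first, following = entries[-1]
            entries[-1] = (first, following + 1)   # extend the current run
        else:
            entries.append((block, 0))             # start a new run
    return entries

free = [2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, 27]
counted = to_counted_entries(free)
print(counted)
```

Fifteen free blocks shrink to four entries here; the saving grows with the length of the contiguous runs.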

48
LINKED FREE SPACE LIST
ON DISK

49
EFFICIENCY AND
PERFORMANCE
Efficiency dependent on:
 Disk allocation and directory algorithms
 Types of data kept in file’s directory entry
 Pre-allocation or as-needed allocation of metadata
structures
 Fixed-size or varying-size data structures
Performance
 Keeping data and metadata close together
 Buffer cache – separate section of main memory
for frequently used blocks
 Synchronous writes sometimes requested by apps
or needed by OS
 No buffering / caching – writes must hit disk before
acknowledgement

50
EFFICIENCY AND
PERFORMANCE CONT
 Asynchronous writes more common, buffer-able, faster
 Free-behind and read-ahead – techniques to optimize
sequential access
 Reads frequently slower than writes

51
RECOVERY

Consistency checking – compares data in directory structure with data blocks on disk, and tries to fix inconsistencies
 Can be slow and sometimes fails

Use system programs to back up data from disk to another storage device (magnetic tape, other magnetic disk, optical)

Recover lost file or disk by restoring data from backup

52
LOG STRUCTURED
FILE SYSTEMS
Log structured (or journaling) file systems record each metadata update to
the file system as a transaction
All transactions are written to a log (similar to a logbook)
 A transaction is considered committed once it is written to the log (sequentially)
 Sometimes to a separate device or section of disk
 However, the file system may not yet be updated

The transactions in the log are asynchronously written to the file system
structures
 When the file system structures are modified, the transaction is removed from the
log
If the file system crashes, all remaining transactions in the log must still be
performed
Faster recovery from crash, removes chance of inconsistency of metadata
53
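
The commit, apply, and replay cycle can be modelled in memory. This is a toy sketch: the class `JournaledStore`, its method names, and the metadata keys are hypothetical, and a real journal would live on disk and survive the crash, which the sketch only imitates by leaving the log untouched.

```python
class JournaledStore:
    """Toy model of metadata journaling."""

    def __init__(self):
        self.log = []            # committed transactions not yet applied
        self.structures = {}     # the file-system metadata proper

    def commit(self, txn_id, updates):
        # A transaction counts as committed once it is in the log.
        self.log.append((txn_id, dict(updates)))

    def apply_next(self):
        # Write the oldest transaction to the structures, then drop it from the log.
        txn_id, updates = self.log[0]
        self.structures.update(updates)
        self.log.pop(0)
        return txn_id

    def recover(self):
        # After a crash, perform every transaction still in the log.
        replayed = []
        while self.log:
            replayed.append(self.apply_next())
        return replayed


store = JournaledStore()
store.commit(1, {"inode 7 size": 4096})
store.commit(2, {"inode 9 link count": 2})
store.apply_next()                     # transaction 1 reaches the structures
pending_at_crash = len(store.log)      # transaction 2 is only in the log
replayed = store.recover()
print(pending_at_crash, replayed, store.structures)
```

Recovery only has to walk the short log rather than check every structure on the disk, which is why it is faster than a full consistency check.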
END OF CHAPTER 9 - 10

54
