File MGMT
I/O Systems
Mass Storage Systems
File System Management
Outline
10.1 File Concept
10.1.1 File Attributes
❑ Name
❑ symbolic file name; the only information kept in human-readable form
❑ Type -
❑ for systems that support multiple types
❑ Location -
❑ pointer to a device and to file location on device
❑ Size -
❑ current file size, maximal possible size
❑ Protection -
❑ controls who can read, write, execute
❑ Time, Date and user identification
❑ data for protection, security and usage monitoring
❑ Information about files is kept in the directory structure, which is
maintained on disk
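The attributes listed above can be inspected on a real system. The following is a minimal sketch using Python's `os.stat`; the helper name `describe` and the scratch file are assumptions made for illustration, and using the extension as the "type" is only a convention, not something the operating system necessarily enforces.

```python
import os
import stat
import tempfile
import time

def describe(path):
    """Collect the attributes listed above for one file, using os.stat."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),             # symbolic, human-readable name
        "type": os.path.splitext(path)[1] or None,  # extension used as a type hint
        "size": info.st_size,                       # current size in bytes
        "protection": stat.filemode(info.st_mode),  # permission string for the file
        "owner_uid": info.st_uid,                   # user identification
        "modified": time.ctime(info.st_mtime),      # time and date of last update
    }

# Hypothetical scratch file, created only for this demonstration.
with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as f:
    f.write(b"hello")
    demo_path = f.name

attributes = describe(demo_path)
os.remove(demo_path)
print(attributes)
```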
10.1.2 File Operations
❑ A file is an abstract data type. It can be defined by
operations:
■ Create a file
■ Write a file
■ Read a file
■ Reposition within file - file seek
■ Delete a file
■ Truncate a file
■ Open(Fi)
❑ search the directory structure on disk for entry Fi, and move the
content of the entry to memory.
■ Close(Fi)
❑ move the content of entry Fi in memory to the directory structure on
disk.
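The operations above map closely onto the low-level calls most systems expose. A minimal sketch using Python's `os` module follows; the scratch directory and the file name `demo.dat` are assumptions made for illustration.

```python
import os
import tempfile

# Hypothetical scratch location; the file name is illustrative only.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo.dat")

fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)  # create and open
os.write(fd, b"0123456789")                       # write
os.lseek(fd, 4, os.SEEK_SET)                      # reposition within file (seek)
middle = os.read(fd, 3)                           # read three bytes from offset 4
os.ftruncate(fd, 5)                               # truncate to 5 bytes
size_after_truncate = os.fstat(fd).st_size
os.close(fd)                                      # close

os.remove(path)                                   # delete
os.rmdir(workdir)
still_exists = os.path.exists(path)
```

Each call takes a file descriptor returned by `os.open`, which plays the role of the in-memory entry that Open(Fi) sets up.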
10.1.3 File types - name.extension
10.1.4 File Structure
❑ None - sequence of words/bytes
❑ Simple record structure
❑ Lines
❑ Fixed Length
❑ Variable Length
❑ Complex Structures
❑ Formatted document
❑ Re-locatable Load File
❑ Can simulate last two with first method by
inserting appropriate control characters
❑ Who decides
❑ Operating System
❑ Program
Directory Structure
Information in a Device Directory
❑ File Name
❑ File Type
❑ Address or Location
❑ Current Length
❑ Maximum Length
❑ Date created, Date last accessed (for archival),
Date last updated (for dump)
❑ Owner ID (who pays), Protection information
■ Also on a per file, per process basis
❑ Current position - read/write position
❑ usage count
Operations Performed on Directory
❑ Search for a file
❑ Create a file
❑ Delete a file
❑ List a directory
❑ Rename a file
❑ Traverse the filesystem
Logical Directory Organization -- Goals
■Efficiency - locating a file quickly
■Naming - convenient to users
■ Two users can have the same name for different files.
■ The same file can have several different names.
■Grouping
■ Logical grouping of files by properties (e.g. all Pascal
programs, all games…)
Single Level Directory
■ A single directory for all users
■ Naming Problem and Grouping Problem
❑ As the number of files increases, it becomes difficult to remember
unique names.
❑ As the number of users increases, all users must still choose
file names that are unique across the whole system.
Two Level Directory
Tree Structured Directories
❑ Absolute or relative path name
❑ Absolute from root
❑ Relative paths from current working directory pointer.
❑ Creating a new file is done in current directory
❑ Creating a new subdirectory is done in current
directory, e.g. mkdir <dir-name>
❑ Delete a file, e.g. rm <file-name>
❑ Deletion of directory
■ Option 1: delete only if the directory is empty
■ Option 2: delete all files and subdirectories under the
directory
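The path and deletion rules above can be tried out directly. This is a minimal sketch using Python's standard library; the directory names `projects` and `os` and the file `notes.txt` are assumptions made for illustration. `os.rmdir` behaves like Option 1 and `shutil.rmtree` like Option 2.

```python
import os
import shutil
import tempfile

root = tempfile.mkdtemp()  # hypothetical scratch root for the demonstration
os.makedirs(os.path.join(root, "projects", "os"))
absolute = os.path.join(root, "projects", "os", "notes.txt")
with open(absolute, "w") as f:
    f.write("file systems\n")

# The same file named relative to a "current directory" of <root>/projects.
relative = os.path.relpath(absolute, start=os.path.join(root, "projects"))

# Option 1: os.rmdir refuses to delete a directory that is not empty.
try:
    os.rmdir(os.path.join(root, "projects"))
    option1_refused = False
except OSError:
    option1_refused = True

# Option 2: shutil.rmtree deletes all files and subdirectories underneath.
shutil.rmtree(os.path.join(root, "projects"))
option2_removed = not os.path.exists(absolute)

os.rmdir(root)  # the root is empty now, so Option 1 succeeds
```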
Acyclic Graph Directories
❑ Naming: a file may have multiple absolute path names
■ Two different names for the same file
❑ Traversal
❑ ensure that shared data structures are traversed only once.
❑ Deletion
■ Removing file when someone deletes it may leave dangling
pointers.
■ Preserve file until all references to it are deleted
❑ Keep a list of all references to a file or
❑ Keep a count of the number of references - reference count.
❑ When count = 0, file can be deleted.
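The reference-count rule above can be illustrated with a toy model. This is a sketch only, not how any particular file system stores its links; the class `SharedFileTable` and the paths are hypothetical.

```python
class SharedFileTable:
    """Toy model: paths point at file ids, and each file id has a reference count."""

    def __init__(self):
        self.entries = {}    # path -> file id
        self.ref_count = {}  # file id -> number of paths referring to it
        self.data = {}       # file id -> contents
        self._next_id = 0

    def create(self, path, contents):
        file_id = self._next_id
        self._next_id += 1
        self.entries[path] = file_id
        self.ref_count[file_id] = 1
        self.data[file_id] = contents
        return file_id

    def link(self, existing_path, new_path):
        """Give an existing file a second name."""
        file_id = self.entries[existing_path]
        self.entries[new_path] = file_id
        self.ref_count[file_id] += 1

    def unlink(self, path):
        """Remove one name; free the file only when the count reaches 0."""
        file_id = self.entries.pop(path)
        self.ref_count[file_id] -= 1
        if self.ref_count[file_id] == 0:
            del self.ref_count[file_id]
            del self.data[file_id]

table = SharedFileTable()
fid = table.create("/home/alice/report", "draft")
table.link("/home/alice/report", "/home/bob/shared-report")

table.unlink("/home/alice/report")
survives = fid in table.data      # one reference is left, so no dangling pointer

table.unlink("/home/bob/shared-report")
freed = fid not in table.data     # count reached 0, so the file is deleted
```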
General Graph Directories
❑ How do we guarantee no cycles in a tree
structured directory?
❑ Allow links only to files, not to subdirectories.
❑ Every time a new link is added, use a cycle detection
algorithm to determine whether it is ok.
❑ If links to directories are allowed, we have a
general graph structure
❑ Need to ensure that components are not traversed twice,
both for correctness and for performance, e.g. a search on a
cyclic structure may never terminate.
❑ File deletion - with cycles, the reference count can be non-zero
even when the file can no longer be reached
❑ Need a garbage collection mechanism to determine whether a file can
be deleted.
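One way to run the cycle check mentioned above is a depth-first search that remembers visited nodes, which also shows how to avoid traversing a component twice. This is a minimal sketch; the function names and the small directory graph are assumptions made for illustration.

```python
def reaches(graph, start, goal):
    """Depth-first search with a visited set, so no directory is expanded twice."""
    stack, visited = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in visited:
            continue
        visited.add(node)
        stack.extend(graph.get(node, ()))
    return False

def can_add_link(graph, parent, target):
    """A new link parent -> target closes a cycle exactly when
    parent is already reachable from target."""
    return not reaches(graph, target, parent)

# Hypothetical directory graph: each directory maps to the directories it contains.
dirs = {"/": ["/a", "/b"], "/a": ["/a/x"], "/a/x": [], "/b": []}

ok_link = can_add_link(dirs, "/b", "/a/x")    # sharing /a/x under /b is safe
bad_link = can_add_link(dirs, "/a/x", "/a")   # /a -> /a/x -> /a would be a cycle
self_link = can_add_link(dirs, "/a", "/a")    # a directory linking to itself
```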
Access Methods
■Sequential Access
read next
write next
reset
no read after last write (rewrite)
■Direct Access (n = relative block number)
read n
write n
position to n
read next
write next
rewrite n
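The two access methods can be contrasted on a file made of fixed-size blocks. This is a minimal sketch over an in-memory byte stream; the class `BlockFile` and its method names are hypothetical and simply mirror the operation names above.

```python
import io

class BlockFile:
    """Fixed-size blocks over an in-memory byte stream (illustrative only)."""

    def __init__(self, data, block_size):
        self.stream = io.BytesIO(data)
        self.block_size = block_size

    # Sequential access: the position advances by one block per call.
    def reset(self):
        self.stream.seek(0)

    def read_next(self):
        return self.stream.read(self.block_size)

    # Direct access: n is the relative block number.
    def position_to(self, n):
        self.stream.seek(n * self.block_size)

    def read(self, n):
        self.position_to(n)
        return self.read_next()

    def write(self, n, block):
        if len(block) != self.block_size:
            raise ValueError("block must be exactly block_size bytes")
        self.position_to(n)
        self.stream.write(block)

f = BlockFile(b"AAAABBBBCCCCDDDD", block_size=4)
first = f.read_next()       # sequential: block 0
second = f.read_next()      # sequential: block 1
fourth = f.read(3)          # direct: jump straight to block 3
f.write(1, b"bbbb")         # direct: overwrite block 1 in place
f.reset()
f.read_next()               # skip block 0
rewritten = f.read_next()   # block 1 after the overwrite
```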
Sequential File Organization
Indexed Sequential or Indexed File Organization
Direct Access File Organization
Protection
Access lists and groups
❑ Associate each file/directory with an access list
■ Problem - the access list can become very long
❑ Solution - condensed version of the list
■ Modes of access: read, write, execute
■ Three classes of users
❑ owner access - the user who created the file
❑ group access - a set of users who share the file and need
similar access
❑ public access - all other users
■ In UNIX, 3 fields of 3 bits each are used.
❑ Fields are user, group, others (u,g,o).
❑ Bits are read, write, execute (r,w,x).
❑ E.g. chmod go+rw file, chmod 761 game
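The 3 x 3 bit layout above can be decoded with a few bit operations. This is a minimal sketch; the helper names `mode_to_string` and `may_access` are hypothetical, and the starting mode `0o700` in the `go+rw` example is an assumption.

```python
def mode_to_string(mode):
    """Render a 9-bit UNIX mode such as 0o761 as an 'rwxrwxrwx'-style string."""
    letters = "rwxrwxrwx"
    return "".join(
        letter if mode & (1 << (8 - i)) else "-"
        for i, letter in enumerate(letters)
    )

def may_access(mode, user_class, wanted):
    """Check one permission bit for user ('u'), group ('g') or others ('o')."""
    shift = {"u": 6, "g": 3, "o": 0}[user_class]
    bit = {"r": 4, "w": 2, "x": 1}[wanted]
    return bool((mode >> shift) & bit)

game = 0o761                       # as in: chmod 761 game
game_string = mode_to_string(game)

# chmod go+rw sets the read and write bits for group and others.
before = 0o700                     # assumed starting mode
after = before | 0o066
after_string = mode_to_string(after)
```

Octal 761 is 111 110 001 in binary: the owner gets read, write and execute, the group gets read and write, and others get execute only.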
File-System Implementation
File-System Structure
■File Structure
■ Logical Storage Unit with collection of related
information
❑ File System resides on secondary storage (disks).
■ To improve I/O efficiency, I/O transfers between memory
and disk are performed in blocks.
❑ Read/Write/Modify/Access each block on disk.
■File system organized into layers.
■File control block - storage structure
consisting of information about a file.
File System Mounting
Allocation of Disk Space
Contiguous Allocation
❑ Each file occupies a set of contiguous blocks on the disk.
❑ Simple - only starting location (block #) and length (number of
blocks) are required.
❑ Suits sequential or direct access.
❑ Fast (very little head movement) and easy to recover in the event
of system crash.
■ Problems
❑ Wasteful of space (dynamic storage-allocation problem). Use first
fit or best fit. Leads to external fragmentation on disk.
❑ Files cannot grow - expanding file requires copying
❑ Users tend to overestimate space - internal fragmentation.
❑ Mapping from logical to physical - <Q,R>
❑ Q = logical address div block size, R = logical address mod block size
❑ Block to be accessed = Q + starting address
❑ Displacement into block = R
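The <Q,R> mapping for contiguous allocation is a single division. This is a minimal sketch, assuming a block size of 512 units and a file that starts at block 14; both values and the function name `contiguous_map` are chosen only for illustration.

```python
def contiguous_map(logical_address, start_block, block_size=512):
    """Map a logical address to (physical block, displacement) for a
    contiguously allocated file. block_size=512 is an assumed value."""
    q, r = divmod(logical_address, block_size)
    return start_block + q, r

# Logical address 1300 in a file whose first block is block 14.
block, displacement = contiguous_map(1300, start_block=14)
```

Here 1300 = 2 x 512 + 276, so the access lands in block 14 + 2 = 16 at displacement 276. No chain has to be followed, which is why contiguous allocation suits direct access.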
Linked Allocation
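❑ Simple - need only starting address.
❑ Free-space management system - space efficient.
■ Files can grow in the middle and at the ends. No estimation of size
necessary.
❑ Suited for sequential access but not random
access.
❑ Directory table maps each file to the head of its list of
blocks.
❑ Mapping - <Q, R>
❑ Q and R are the quotient and remainder of the logical address
divided by the usable data size of a block (block size minus the
space taken by the link pointer).
❑ Block to be accessed is the Qth block in the linked chain of
blocks representing the file.
❑ Displacement into block = R + 1 (assuming a one-unit pointer
stored at the start of each block)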
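The linked-allocation mapping can be sketched the same way. The code assumes 512-unit blocks whose first unit holds the link pointer, which is the reading under which "R + 1" is the displacement; the block numbers in the chain and the function names are hypothetical.

```python
def linked_map(logical_address, block_size=512, pointer_size=1):
    """Return (Q, displacement) for linked allocation. Assumes the link
    pointer occupies the first pointer_size units of every block."""
    q, r = divmod(logical_address, block_size - pointer_size)
    return q, r + pointer_size

def follow_chain(next_block, head, q):
    """Walk q links from the head block. This walk is the reason random
    access is slow with linked allocation."""
    block = head
    for _ in range(q):
        block = next_block[block]
    return block

# Hypothetical chain of blocks for one file: 9 -> 16 -> 1 -> 10 -> 25.
chain = {9: 16, 16: 1, 1: 10, 10: 25, 25: None}

q, displacement = linked_map(1300)
physical_block = follow_chain(chain, head=9, q=q)
```

With 511 usable units per block, 1300 = 2 x 511 + 278, so the data sits in the block two links after the head, at displacement 278 + 1 = 279.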
Indexed Allocation
Indexed Allocation - Mapping
Indexed File - Linked Scheme
Indexed Allocation - Multilevel index
Combined Scheme: UNIX (4K bytes per block)
[Figure: UNIX inode holding mode, owners, timestamps, size and block count, followed by direct block pointers and single, double and triple indirect pointers, each leading to data blocks]
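The reach of the combined scheme can be estimated with a little arithmetic. This is a sketch under stated assumptions: only the 4K block size comes from the slide, while the 4-byte block pointers and the 12 direct pointers are assumed values that vary between UNIX file systems.

```python
def max_file_blocks(block_size=4096, pointer_size=4, direct=12):
    """Number of data blocks addressable through direct, single, double
    and triple indirect pointers. pointer_size and direct are assumptions."""
    per_block = block_size // pointer_size   # pointers held by one index block
    return direct + per_block + per_block**2 + per_block**3

BLOCK_SIZE = 4096
total_blocks = max_file_blocks(BLOCK_SIZE)
max_bytes = total_blocks * BLOCK_SIZE
print(total_blocks, max_bytes)
```

Under these assumptions one index block holds 1024 pointers, so almost all of the capacity comes from the triple indirect pointer, while small files are served entirely by the direct pointers with no extra index reads.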
Free Space Management
❑ Linked list (free list)
❑ Keep a linked list of free blocks
❑ Cannot get contiguous space easily; not very efficient because
the linked list must be traversed.
❑ No waste of space
❑ Linked list of indices - Grouping
❑ Keep a linked list of index blocks. Each index block contains
addresses of free blocks and a pointer to the next index block.
❑ Can find the addresses of a large number of free blocks quickly.
❑ Counting
❑ Linked list of runs of contiguous free blocks
❑ Each free list node contains a pointer and the number of free blocks starting
from that address.
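The counting representation can be sketched as a list of (start, count) pairs. This is a minimal illustration with hypothetical block numbers and function names; a real free list would live on disk rather than in a Python list, and first fit is just one possible allocation policy.

```python
def to_runs(free_blocks):
    """Compress a collection of free block numbers into (start, count) runs."""
    runs = []
    for block in sorted(free_blocks):
        if runs and runs[-1][0] + runs[-1][1] == block:
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((block, 1))                     # start a new run
    return runs

def allocate_contiguous(runs, wanted):
    """First fit: take `wanted` blocks from the first run that is large enough.
    Returns the starting block, or None if no run can satisfy the request."""
    for i, (start, count) in enumerate(runs):
        if count >= wanted:
            if count == wanted:
                del runs[i]
            else:
                runs[i] = (start + wanted, count - wanted)
            return start
    return None

free = [2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, 27]
runs = to_runs(free)
start = allocate_contiguous(runs, 5)
```

Fifteen free blocks collapse into four nodes, and a contiguous request can be answered by inspecting counts instead of walking individual blocks.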
■Need to protect
■ Pointer to free list
■ Bit map
❑ Must be kept on disk
❑ Copy in memory and copy on disk may differ.
❑ Cannot allow a situation where, for block[i], bit[i] = 1
in memory and bit[i] = 0 on disk
■ Solution
❑ Set bit[i] = 1 on disk
❑ Allocate block[i]
❑ Set bit[i] = 1 in memory.
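The ordering in the solution above can be sketched with two in-memory lists standing in for the disk and memory copies of the bit map. The class `BitmapAllocator` is hypothetical and performs no real disk I/O; it only shows that writing the disk copy first keeps the forbidden state from ever arising.

```python
class BitmapAllocator:
    """Toy model of the allocation protocol; bit = 1 means the block is allocated."""

    def __init__(self, n_blocks):
        self.disk_bits = [0] * n_blocks     # stand-in for the on-disk bit map
        self.memory_bits = [0] * n_blocks   # cached copy in memory
        self.log = []                       # order of steps, for inspection

    def allocate(self):
        i = self.memory_bits.index(0)       # raises ValueError if nothing is free
        self.disk_bits[i] = 1               # step 1: set bit[i] = 1 on disk
        self.log.append(("disk", i))
        self.log.append(("allocate", i))    # step 2: allocate block[i]
        self.memory_bits[i] = 1             # step 3: set bit[i] = 1 in memory
        self.log.append(("memory", i))
        return i

    def invariant_holds(self):
        """True if no bit is 1 in memory while still 0 on disk."""
        return all(
            not (m == 1 and d == 0)
            for m, d in zip(self.memory_bits, self.disk_bits)
        )

allocator = BitmapAllocator(4)
first_block = allocator.allocate()
second_block = allocator.allocate()
```

If a crash happened between steps 1 and 3, the disk would at worst mark a block as allocated that nobody uses, which wastes a block but never hands the same block out twice.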
Directory Implementation
Efficiency and Performance
❑ Efficiency depends on:
❑ disk allocation and directory algorithms
❑ types of data kept in the file's directory entry
❑ Dynamic allocation of kernel structures
❑ Performance improved by:
■ On-board cache - for disk controllers
■ Disk cache - separate section of main memory for frequently
used blocks. Block replacement mechanisms:
❑ LRU
❑ Free-behind - removes a block from the buffer as soon as the next block is
requested.
❑ Read-ahead - the requested block and several subsequent blocks are
read and cached.
■ Improve PC performance by dedicating a section of memory as a
virtual disk or RAM disk.
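A disk cache with LRU replacement and read-ahead can be sketched in a few lines. This is a minimal illustration, not a real buffer cache: the class `BlockCache`, its parameters and the fake `disk` function are all hypothetical.

```python
from collections import OrderedDict

class BlockCache:
    """LRU cache of disk blocks with optional read-ahead (illustrative only)."""

    def __init__(self, read_block, capacity, read_ahead=0):
        self.read_block = read_block   # callable: block number -> data
        self.capacity = capacity
        self.read_ahead = read_ahead
        self.blocks = OrderedDict()    # least recently used entry comes first
        self.disk_reads = 0

    def _load(self, n):
        if n in self.blocks:
            return
        self.blocks[n] = self.read_block(n)
        self.disk_reads += 1
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)   # evict the least recently used block

    def get(self, n):
        if n in self.blocks:
            self.blocks.move_to_end(n)        # a hit makes the block most recent
        else:
            self._load(n)
        data = self.blocks[n]
        for k in range(n + 1, n + 1 + self.read_ahead):
            self._load(k)                     # read-ahead: prefetch following blocks
        return data

def disk(n):
    """Stand-in for a slow disk read."""
    return f"block-{n}"

lru = BlockCache(disk, capacity=2)
for n in (0, 1, 0, 2):
    lru.get(n)

prefetching = BlockCache(disk, capacity=4, read_ahead=2)
value = prefetching.get(10)
```

In the LRU run, block 1 is evicted when block 2 arrives because block 0 was touched more recently. In the read-ahead run, a single request for block 10 also brings in blocks 11 and 12, which pays off for sequential access.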
Recovery
End of File Systems Concepts