Lecture 09 10
FILE ATTRIBUTES
Name – only information kept in human-readable form
Identifier – unique tag (number) identifies file within file system
Type – needed for systems that support different types
Location – pointer to file location on device (path)
Size – current file size
Protection – controls who can do reading, writing, executing
Time, date, and user identification – data for protection, security, and usage monitoring
Information about files is kept in the directory structure, which is maintained on the disk
Many variations, including extended file attributes such as file checksum
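Most of these attributes can be inspected from a program. The sketch below uses Python's standard os.stat call on a throwaway temporary file; the attr_demo_ prefix and the dictionary keys are illustrative choices, and which fields are meaningful varies by operating system.

```python
import os
import stat
import tempfile
import time

# Create a small throwaway file so there is something to inspect.
fd, path = tempfile.mkstemp(prefix="attr_demo_")
os.write(fd, b"hello")
os.close(fd)

info = os.stat(path)

attributes = {
    "name": os.path.basename(path),              # human-readable name
    "identifier": info.st_ino,                   # number identifying the file in its file system
    "size": info.st_size,                        # current size in bytes
    "protection": stat.filemode(info.st_mode),   # permission string for the file
    "owner_uid": info.st_uid,                    # user identification
    "modified": time.ctime(info.st_mtime),       # time and date of last change
}

for key, value in attributes.items():
    print(f"{key:>10}: {value}")

os.remove(path)
```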
FILE STRUCTURE
None - sequence of words, bytes
Simple record structure
Lines
Fixed length
Variable length
Complex Structures
Formatted document
Relocatable load file
The last two can be simulated with the first method by inserting appropriate control characters
Who decides the structure:
Operating system
Program
FILE OPERATIONS
File is an abstract data type
Create
Write – at write pointer location
Read – at read pointer location
Reposition within file - seek
Delete – remove a file
Truncate – erase the contents of a file (from a given point to the end) while keeping its attributes
Open(Fi) – search the directory structure on disk for entry Fi, and move the content of the entry to memory
Close(Fi) – move the content of entry Fi in memory to the directory structure on disk
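The operations listed above map closely onto ordinary file APIs. A minimal sketch using Python's built-in file objects follows; the file name ops_demo.txt is arbitrary, and the comments note which operation each call illustrates.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ops_demo.txt")

    # Create and open: "w+b" creates the file and opens it for reading and writing.
    with open(path, "w+b") as f:
        f.write(b"0123456789")   # write at the current position
        f.seek(2)                # reposition within the file
        chunk = f.read(3)        # read at the current position
        position = f.tell()      # the file pointer now sits after the bytes read
        f.truncate(4)            # keep only the first 4 bytes
        f.seek(0)
        remaining = f.read()
    # Leaving the inner with-block closes the file.

    os.remove(path)              # delete
    deleted = not os.path.exists(path)

print(chunk, position, remaining, deleted)
```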
OPEN FILES
Several pieces of data are needed to manage open files:
Open-file table – tracks open files
File pointer – pointer to the last read/write location, per process that has the file open
File-open count – counts the number of times a file is open, so that its entry can be removed from the open-file table when the last process closes it
Disk location of the file – cache of data access information
Access rights – per-process access mode information
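How these pieces fit together can be sketched with a toy table. The class name OpenFileTable, the dictionary layout, and keying entries by file name are all assumptions made for illustration; a real kernel keeps this state in its own structures.

```python
class OpenFileTable:
    """Toy system-wide open-file table keyed by file name (an assumption)."""

    def __init__(self):
        self.entries = {}   # name -> {"disk_location": ..., "open_count": ...}

    def open(self, name, disk_location):
        entry = self.entries.setdefault(
            name, {"disk_location": disk_location, "open_count": 0}
        )
        entry["open_count"] += 1
        # Per-process state: each opener gets its own pointer and access mode.
        return {"name": name, "pointer": 0, "mode": "r"}

    def close(self, handle):
        entry = self.entries[handle["name"]]
        entry["open_count"] -= 1
        if entry["open_count"] == 0:
            # Last close: the entry can leave the table.
            del self.entries[handle["name"]]


table = OpenFileTable()
h1 = table.open("notes.txt", disk_location=120)
h2 = table.open("notes.txt", disk_location=120)
h1["pointer"] = 40                             # moving one pointer leaves the other untouched
table.close(h1)
still_open = "notes.txt" in table.entries      # h2 still has the file open
table.close(h2)
removed = "notes.txt" not in table.entries     # last close removed the entry
```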
OPEN FILE LOCKING
Provided by some operating systems and file systems
Similar to reader-writer locks
Shared lock like reader lock – several processes can acquire concurrently
Exclusive lock like writer lock
Mandatory or advisory:
Mandatory – access is denied depending on locks held and requested
Advisory – processes can find status of locks and decide what to do
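The shared/exclusive rule can be illustrated with a small in-memory model. This is a simulation, not an operating-system API: the FileLock class and its method names are hypothetical, and returning False stands in for a non-blocking request being refused.

```python
class FileLock:
    """Toy lock for one file: many shared holders or one exclusive holder."""

    def __init__(self):
        self.shared_holders = set()
        self.exclusive_holder = None

    def acquire_shared(self, pid):
        if self.exclusive_holder is not None:
            return False                    # a writer holds the lock
        self.shared_holders.add(pid)
        return True

    def acquire_exclusive(self, pid):
        if self.exclusive_holder is not None or self.shared_holders:
            return False                    # someone else holds the lock
        self.exclusive_holder = pid
        return True

    def release(self, pid):
        self.shared_holders.discard(pid)
        if self.exclusive_holder == pid:
            self.exclusive_holder = None


lock = FileLock()
results = [
    lock.acquire_shared(1),       # first reader
    lock.acquire_shared(2),       # readers can share
    lock.acquire_exclusive(3),    # refused while readers hold the lock
]
lock.release(1)
lock.release(2)
results.append(lock.acquire_exclusive(3))   # granted once no holders remain
results.append(lock.acquire_shared(1))      # refused while the writer holds the lock
print(results)
```

Under a mandatory scheme the operating system would enforce these refusals on every access; under an advisory scheme it is up to each process to ask first and respect the answer.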
FILE TYPES – NAME,
EXTENSION
SEQUENTIAL-ACCESS
FILE
ACCESS METHODS
SIMULATION OF SEQUENTIAL
ACCESS ON DIRECT-ACCESS FILE
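One common way to do what this heading describes is to keep a current-position variable and translate each sequential operation into a direct-access operation on that record number. The sketch below assumes that approach; the class names and the variable cp are illustrative.

```python
class DirectAccessFile:
    """Toy direct-access file: fixed-size records addressed by record number."""

    def __init__(self, record_count):
        self.records = [None] * record_count

    def read(self, n):
        return self.records[n]

    def write(self, n, value):
        self.records[n] = value


class SequentialView:
    """Sequential interface layered on a direct-access file via a current position."""

    def __init__(self, file):
        self.file = file
        self.cp = 0                       # current position (record number)

    def reset(self):
        self.cp = 0

    def read_next(self):
        value = self.file.read(self.cp)   # direct read at cp ...
        self.cp += 1                      # ... then advance
        return value

    def write_next(self, value):
        self.file.write(self.cp, value)   # direct write at cp ...
        self.cp += 1                      # ... then advance


f = DirectAccessFile(record_count=3)
seq = SequentialView(f)
for item in ("a", "b", "c"):
    seq.write_next(item)
seq.reset()
read_back = [seq.read_next() for _ in range(3)]
print(read_back)
```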
OTHER ACCESS
METHODS
Can be built on top of base methods
Generally involve creation of an index for the file
Keep index in memory for fast determination of location of data to be
operated on (consider UPC code plus record of data about that item)
If the index is too large, keep an index (in memory) of the index (on disk)
Example: IBM Indexed Sequential-Access Method (ISAM)
Small master index, points to disk blocks of secondary index
File kept sorted on a defined key
All done by the OS
VMS operating system provides index and relative files as another
example (see next slide)
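The master-index idea can be sketched as a two-level lookup over a file kept sorted by key. The data, block layout, and function name below are hypothetical and only loosely modelled on the ISAM description above; lists stand in for disk blocks.

```python
from bisect import bisect_right

# Hypothetical sorted file: each data block holds a few (key, record) pairs.
data_blocks = [
    [(101, "apples"), (105, "bread")],
    [(230, "cheese"), (260, "dates")],
    [(310, "eggs"), (377, "flour")],
    [(402, "grapes"), (450, "honey")],
]

# Secondary index blocks ("on disk"): (first key in data block, data block number).
secondary_index = [
    [(101, 0), (230, 1)],
    [(310, 2), (402, 3)],
]

# Small master index ("in memory"): first key covered by each secondary block.
master_index = [101, 310]


def lookup(key):
    """Return the record for key, or None if the key is absent."""
    m = bisect_right(master_index, key) - 1        # which secondary block?
    if m < 0:
        return None                                # key is below every indexed key
    block = secondary_index[m]
    first_keys = [k for k, _ in block]
    s = bisect_right(first_keys, key) - 1          # which data block?
    for k, record in data_blocks[block[s][1]]:
        if k == key:
            return record
    return None


print(lookup(260), lookup(402), lookup(300))
```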
EXAMPLE OF INDEX AND
RELATIVE FILES
[Figure: directory structure, with directory entries pointing to files F1, F2, F3, F4, ..., Fn]
DISK STRUCTURE
Disk can be subdivided into partitions
Disks or partitions can be RAID protected against failure (any RAID level other than 0, which provides no redundancy)
Disk or partition can be used raw – without a file system, or formatted with a file system
Partitions also known as minidisks, slices
Entity containing file system known as a volume
Each volume containing a file system also tracks that file system's info in a device directory or volume table of contents
As well as general-purpose file systems there are many special-purpose file systems, frequently all within the same operating system or computer
A TYPICAL FILE-SYSTEM
ORGANIZATION
OPERATIONS
PERFORMED ON
DIRECTORY
ORGANIZATION OF THE
DIRECTORY
Efficiency – locating a file quickly (Must be effective)
Naming – convenient to users
Two users can have same name for different files
The same file can have several different names
Grouping – logical grouping of files by properties, (e.g., all Java
programs, all games, …)
Single directory for all users
Naming problem and grouping problem
Separate directory for each user
Different users can have the same file name
Efficient searching
No grouping capability
Tree-Structured Directories
Advantages
•Commonly used
•Efficient searching
•Grouping Capability
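Searching a tree-structured directory amounts to walking a path one component at a time. The sketch below models directories as nested dictionaries; the tree contents and the resolve function are invented for illustration.

```python
# Hypothetical directory tree: dicts are directories, strings are file contents.
root = {
    "home": {
        "alice": {"notes.txt": "alice's notes", "java": {"Main.java": "class Main {}"}},
        "bob": {"notes.txt": "bob's notes"},
    },
    "games": {"chess": "binary"},
}


def resolve(path):
    """Walk an absolute path one component at a time; return None if missing."""
    node = root
    for part in path.strip("/").split("/"):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


print(resolve("/home/alice/notes.txt"))
print(resolve("/home/bob/notes.txt"))
```

The two notes.txt files do not clash because each lives in its own subdirectory, and related files (here, alice's Java sources) can be grouped under one directory.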
TREE-STRUCTURED
DIRECTORIES
FILE SYSTEM
MOUNTING
On a multi-user system
User IDs identify users, allowing permissions and protections to be per-user
Group IDs allow users to be in groups, permitting group access rights
Owner of a file / directory
Group of a file / directory
FILE SHARING – REMOTE
FILE SYSTEMS
Uses networking to allow file system access between systems
Manually via programs like FTP
Automatically, seamlessly using distributed file systems (DFS)
Semi automatically via the world wide web (WWW)
Types of access
Read
Write
Execute
Append
Delete
List
ACCESS LISTS AND
GROUPS
Mode of access: read, write, execute
Three classes of users on Unix / Linux
                     RWX
a) owner access   7  111
b) group access   6  110
c) public access  1  001
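The three digits in the table above (7, 6, 1) combine into the octal mode 761. A small sketch of how such a mode can be decoded bit by bit follows; the function names describe and allowed are illustrative.

```python
def describe(mode):
    """Turn a three-digit octal mode such as 0o761 into an rwx string."""
    text = ""
    for shift in (6, 3, 0):                  # owner, group, public
        bits = (mode >> shift) & 0b111
        text += "r" if bits & 0b100 else "-"
        text += "w" if bits & 0b010 else "-"
        text += "x" if bits & 0b001 else "-"
    return text


def allowed(mode, user_class, access):
    """Check one access type for 'owner', 'group', or 'public'."""
    shift = {"owner": 6, "group": 3, "public": 0}[user_class]
    bit = {"r": 0b100, "w": 0b010, "x": 0b001}[access]
    return bool((mode >> shift) & bit)


print(describe(0o761))                  # rwxrw---x
print(allowed(0o761, "group", "w"))     # True
print(allowed(0o761, "public", "r"))    # False
```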
A SAMPLE UNIX
DIRECTORY LISTING
DIRECTORY
IMPLEMENTATION
Linear list of file names with pointers to the data blocks
Simple to program
Time-consuming to execute
Linear search time
Could keep ordered alphabetically via linked list or use a B+ tree
Hash table – linear list with hash data structure
Decreases directory search time
Collisions – situations where two file names hash to the same location
Only good if entries are fixed size, or use chained-overflow method
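The two directory layouts can be compared with a short sketch. Both classes are toy models with invented names; the hash function (sum of byte values) is deliberately simple so that a collision is easy to produce, and colliding names go into a per-slot chain.

```python
class LinearDirectory:
    """Directory as an unsorted list of (name, first_block) entries."""

    def __init__(self):
        self.entries = []

    def add(self, name, block):
        self.entries.append((name, block))

    def lookup(self, name):
        for entry_name, block in self.entries:     # linear scan
            if entry_name == name:
                return block
        return None


class HashedDirectory:
    """Directory as a fixed-size hash table; collisions go into per-slot chains."""

    def __init__(self, slots=8):
        self.table = [[] for _ in range(slots)]

    def _slot(self, name):
        # Simple deterministic hash chosen for illustration only.
        return sum(name.encode()) % len(self.table)

    def add(self, name, block):
        self.table[self._slot(name)].append((name, block))

    def lookup(self, name):
        for entry_name, block in self.table[self._slot(name)]:
            if entry_name == name:
                return block
        return None


hashed = HashedDirectory()
hashed.add("ab", 10)
hashed.add("ba", 20)          # same byte sum as "ab", so it lands in the same slot
print(hashed.lookup("ab"), hashed.lookup("ba"), hashed.lookup("zz"))
```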
ALLOCATION METHODS
- CONTIGUOUS
An allocation method refers to how disk blocks are
allocated for files:
Contiguous allocation – each file occupies set of
contiguous blocks
Best performance in most cases
Simple – only starting location (block #) and length
(number of blocks) are required (based on frame
concept)
Problems include finding space for file, knowing file
size, external fragmentation, need for compaction
off-line (downtime) or on-line
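Because only the starting block and length are stored, mapping a byte offset in the file to a disk block is a single division. A minimal sketch, assuming 512-byte blocks (the block size and function name are illustrative):

```python
BLOCK_SIZE = 512   # bytes per block (assumed for illustration)


def contiguous_lookup(start_block, length_blocks, logical_address):
    """Map a byte offset within the file to (disk block, offset in block)."""
    index, offset = divmod(logical_address, BLOCK_SIZE)
    if index >= length_blocks:
        raise ValueError("address is past the end of the file")
    return start_block + index, offset


# A file starting at block 14 and spanning 3 blocks:
print(contiguous_lookup(14, 3, 1030))   # (16, 6)
```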
CONTIGUOUS ALLOCATION
OF DISK SPACE
ALLOCATION METHODS -
LINKED
Linked allocation – each file a linked list of blocks
File ends at nil pointer
No external fragmentation
Each block contains pointer to next block
No compaction
Free space management system called when new block needed
Improve efficiency by clustering blocks into groups but increases internal
fragmentation
Reliability can be a problem
Locating a block can take many I/Os and disk seeks
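The cost of locating a block can be seen in a small model where each block stores its data plus a pointer to the next block. The block numbers and contents below are made up; the point is that reaching the n-th block means reading every block before it.

```python
# Hypothetical disk: block number -> (data, pointer to next block, None for nil).
disk = {
    9: ("part0", 16),
    16: ("part1", 1),
    1: ("part2", 10),
    10: ("part3", 25),
    25: ("part4", None),
}


def read_logical_block(first_block, n):
    """Return (data, blocks_read) for the n-th block of the file, counting from 0."""
    block = first_block
    blocks_read = 0
    for _ in range(n):
        _, block = disk[block]          # must read each block just to find the next
        blocks_read += 1
        if block is None:
            raise IndexError("file has fewer blocks than requested")
    data, _ = disk[block]
    return data, blocks_read + 1


print(read_logical_block(9, 3))   # ('part3', 4)
```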
FILE-ALLOCATION TABLE
ALLOCATION METHODS -
INDEXED
Indexed allocation
Each file has its own index block(s) of pointers to its data blocks
[Figure: logical view of the index table]
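With an index block, any logical block is reached with one table lookup instead of a chain walk. A brief sketch, assuming 512-byte blocks and an invented index block:

```python
BLOCK_SIZE = 512   # bytes per block (assumed for illustration)

# Hypothetical index block for one file: entry i is the disk block
# holding logical block i of the file.
index_block = [9, 16, 1, 10, 25]


def indexed_lookup(logical_address):
    """Map a byte offset within the file to (disk block, offset in block)."""
    index, offset = divmod(logical_address, BLOCK_SIZE)
    return index_block[index], offset


print(indexed_lookup(1600))   # (10, 64)
```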
EXAMPLE OF INDEXED
ALLOCATION
COMBINED SCHEME: UNIX UFS
(4K BYTES PER BLOCK, 32-BIT ADDRESSES)
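The figures in the title are enough to estimate how large a file such a scheme can address. The calculation below assumes 12 direct pointers plus one single, one double, and one triple indirect block; the count of 12 is an assumption, since the slide text does not state it.

```python
BLOCK_SIZE = 4 * 1024       # 4 KB per block, from the slide title
ADDRESS_SIZE = 4            # 32-bit block addresses are 4 bytes each
DIRECT_POINTERS = 12        # assumption: not stated in the slide

pointers_per_block = BLOCK_SIZE // ADDRESS_SIZE      # pointers that fit in one index block

direct_blocks = DIRECT_POINTERS
single_indirect = pointers_per_block
double_indirect = pointers_per_block ** 2
triple_indirect = pointers_per_block ** 3

max_blocks = direct_blocks + single_indirect + double_indirect + triple_indirect
max_bytes = max_blocks * BLOCK_SIZE

print(pointers_per_block, max_blocks, max_bytes / 2 ** 40)
```

Under these assumptions the triple indirect block dominates, and the total comes to a little over 4 TiB.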
PERFORMANCE
Best method depends on file access type
Contiguous great for sequential and random
Linked good for sequential, not random
Free-space bit map (one bit per block):
bit[i] = 1 if block[i] is free
bit[i] = 0 if block[i] is occupied
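Finding a free block with this convention is usually done a word at a time: skip words that are all zero, then locate the first 1 bit in the first non-zero word. The sketch below assumes 8-bit words stored most-significant-bit first; both choices are for illustration only.

```python
BITS_PER_WORD = 8   # assumed word size, kept small for illustration


def first_free_block(words):
    """Return the number of the first free block, or None if the disk is full.

    words is a list of ints; within each word the lowest-numbered block
    is the most significant bit.
    """
    for word_index, word in enumerate(words):
        if word != 0:                                     # skip fully occupied words
            offset = BITS_PER_WORD - word.bit_length()    # position of first 1 bit
            return BITS_PER_WORD * word_index + offset
    return None


# Blocks 0-10 occupied, block 11 is the first free one.
print(first_free_block([0b00000000, 0b00010110]))   # 11
```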
FREE-SPACE MANAGEMENT
(CONT.)
Bit map requires extra space
Example:
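The slide's own example did not survive extraction. As a stand-in, the calculation below uses assumed figures (4 KB blocks on a 1 TB disk) to show how the extra space is worked out: one bit per block, eight bits per byte.

```python
block_size = 2 ** 12          # 4 KB blocks (assumed)
disk_size = 2 ** 40           # 1 TB disk (assumed)

blocks = disk_size // block_size       # number of blocks, so also number of bits
bitmap_bytes = blocks // 8             # eight bits per byte
bitmap_mib = bitmap_bytes / 2 ** 20

print(blocks, bitmap_bytes, bitmap_mib)   # 268435456 33554432 32.0
```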
FREE-SPACE MANAGEMENT (CONT.)
Grouping
Modify linked list to store address of next n-1 free blocks in first free
block, plus a pointer to next block that contains free-block-pointers
(like this one)
Counting
Because space is frequently contiguously used and freed, with
contiguous allocation, extents, or clustering
Keep address of first free block and count of following free blocks
Free space list then has entries containing addresses and counts
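The counting representation can be sketched by collapsing a sorted list of free block numbers into runs. The function name to_runs is illustrative, and here the count includes the first block of the run, which is one of several equivalent conventions.

```python
def to_runs(free_blocks):
    """Collapse a sorted list of free block numbers into (first_block, count) entries."""
    runs = []
    for block in free_blocks:
        if runs and block == runs[-1][0] + runs[-1][1]:
            first, count = runs[-1]
            runs[-1] = (first, count + 1)     # block extends the current run
        else:
            runs.append((block, 1))           # block starts a new run
    return runs


print(to_runs([2, 3, 4, 5, 8, 9, 17]))   # [(2, 4), (8, 2), (17, 1)]
```

Seven list entries shrink to three, which is the saving the counting scheme relies on when free space tends to come in contiguous stretches.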
LINKED FREE SPACE LIST
ON DISK
EFFICIENCY AND
PERFORMANCE
Efficiency dependent on:
Disk allocation and directory algorithms
Types of data kept in file’s directory entry
Pre-allocation or as-needed allocation of metadata
structures
Fixed-size or varying-size data structures
Performance
Keeping data and metadata close together
Buffer cache – separate section of main memory
for frequently used blocks
Synchronous writes sometimes requested by apps
or needed by OS
No buffering / caching – writes must hit disk before
acknowledgement
EFFICIENCY AND
PERFORMANCE CONT
Asynchronous writes more common, buffer-able, faster
Free-behind and read-ahead – techniques to optimize
sequential access
Reads are frequently slower than writes, because writes can be buffered while a read may have to wait for the disk
RECOVERY
LOG STRUCTURED
FILE SYSTEMS
Log structured (or journaling) file systems record each metadata update to
the file system as a transaction
All transactions are written to a log (much like a logbook)
A transaction is considered committed once it is written to the log (sequentially)
Sometimes to a separate device or section of disk
However, the file system may not yet be updated
The transactions in the log are asynchronously written to the file system
structures
When the file system structures are modified, the transaction is removed from the
log
If the file system crashes, all remaining transactions in the log must still be
performed
Faster recovery from crash, removes chance of inconsistency of metadata
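The commit, apply, and replay steps described above can be sketched with a toy in-memory model. The class JournaledFS and its method names are hypothetical; a real file system writes the log to disk and applies it asynchronously, which this sketch only imitates with explicit calls.

```python
class JournaledFS:
    """Toy metadata store with a write-ahead log."""

    def __init__(self):
        self.log = []          # committed transactions not yet applied
        self.metadata = {}     # stands in for the file system structures

    def commit(self, key, value):
        # The update counts as committed once it is appended to the log.
        self.log.append((key, value))

    def apply_one(self):
        # Later the update reaches the structures, and only then is it
        # removed from the log.
        key, value = self.log[0]
        self.metadata[key] = value
        self.log.pop(0)

    def recover(self):
        # After a crash, perform whatever is still in the log, in order.
        while self.log:
            self.apply_one()


fs = JournaledFS()
fs.commit("inode 7 size", 4096)
fs.commit("inode 7 mtime", "12:00")
fs.apply_one()                          # first update reaches the structures
pending_before_crash = list(fs.log)     # one update is still only in the log
# Simulated crash here: the log survives, so recovery can finish the job.
fs.recover()
print(pending_before_crash, fs.metadata)
```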
END OF CHAPTER 9 - 10