File Management
Learning Objectives (continued)
• Comparisons of sequential and direct file access
• The security ramifications of access control
techniques and how they compare
• The role of data compression in file storage
The File Manager
Responsibilities of the File Manager
• Four tasks
– File storage tracking
– Policy implementation
• Determine where and how files are stored
• Efficiently use available storage space
• Provide efficient file access
– File allocation if user access cleared
• Record file use
– File deallocation
• File returned to storage
• Communicate file availability
Responsibilities of the File Manager
(continued)
• Policy determines:
– File storage location
– System and user access
• Uses device-independent commands
• Access to material
– Two factors
• Flexibility of access to information (Factor 1)
– Shared files
– Providing distributed access
– Allowing users to browse public directories
Responsibilities of the File Manager
(continued)
• Subsequent protection (Factor 2)
– Prevent system malfunctions
– Security checks
• Account numbers, passwords, lockwords
• File allocation
– Activate secondary storage device, load file into
memory, update records
• File deallocation
– Update file tables, rewrite file (if revised), notify
waiting processes of file availability
Definitions
• Field
– Group of related bytes
– Identified by user (name, type, size)
• Record
– Group of related fields
• File
– Group of related records
– Information used by specific application programs
• Report generation
– Flat file
• No connections to other files, no dimensionality
Definitions (continued)
• Databases
– Groups of related files
– Interconnected at various levels
• Give users flexibility of access to stored data
• Program files
– Contain instructions
• Data files
– Contain data
• Directories
– Listings of filenames and their attributes
Interacting with the File Manager
• Commands
– Embedded in program
• OPEN, CLOSE, READ, WRITE, MODIFY
– Submitted interactively
• CREATE, DELETE, RENAME, COPY
• Device independent
– Physical location knowledge not needed
• Cylinder, surface, sector
– Device medium knowledge not needed
• Tape, magnetic disk, optical disc, flash storage
– Network knowledge not needed
Interacting with the File Manager
(continued)
• Logical commands
– Broken into lower-level signals
– Example: READ
• Move read/write heads to record cylinder
• Wait for rotational delay (sector containing record
passes under read/write head)
• Activate appropriate read/write head and read record
• Transfer record to main memory
• Send flag indicating free device for another request
• Performs error checking and correction
– No need for error-checking code in programs
Typical Volume Configuration
• Volume
– Secondary storage unit (removable, nonremovable)
– Multifile volume
• Contains many files
– Multivolume files
• Extremely large files spread across several volumes
• Volume name
– File manager manages
– Easily accessible
• Innermost part of CD, beginning of tape, first sector of
outermost track
Typical Volume Configuration
(continued)
• Master file directory (MFD)
• Stored immediately after volume descriptor
• Lists
– Names and characteristics of every file in volume
• File names (program files, data files, system files)
– Subdirectories
• If supported by file manager
– Remainder of volume
• Used for file storage
Typical Volume Configuration
(continued)
• Single directory per volume
– Supported by early operating systems
• Disadvantages
– Long search time for individual file
– Directory space filled before disk storage space filled
– Users cannot create subdirectories
– Users cannot safeguard their files
– Each program needs unique name
• Even those serving many users
About Subdirectories
• File managers today
– Users create own subdirectories (folders)
• Related files grouped together
– Implemented as upside-down tree
• Efficient system searching of individual directories
• May require several directories to reach file
About Subdirectories (continued)
• File descriptor
– Filename: ASCII code
– File type: organization and usage
• System dependent
– File size: for convenience
– File location
• First physical block identification
– Date and time of creation
– Owner
– Protection information: access restrictions
– Record size: fixed size, maximum size
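The descriptor fields listed above can be pictured as one small record per file. The following is a minimal sketch, not the layout of any particular operating system: the class name, the sample values, and the "RW" protection encoding are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FileDescriptor:
    """Hypothetical in-memory form of one directory entry."""
    filename: str        # stored as character codes (ASCII in the slides)
    file_type: str       # organization and usage; system dependent
    size_bytes: int      # kept for convenience
    first_block: int     # identification of the first physical block
    created: datetime    # date and time of creation
    owner: str
    protection: str      # access restrictions (assumed encoding, e.g. "RW")
    record_size: int     # fixed size, or maximum size if records vary


# Sample entry; every value here is made up for the example.
desc = FileDescriptor(
    filename="INVENTORY.DAT",
    file_type="data",
    size_bytes=4096,
    first_block=120,
    created=datetime(2024, 1, 15, 9, 30),
    owner="alice",
    protection="RW",
    record_size=64,
)

# With fixed-size records, the record count follows from size and record size.
records_in_file = desc.size_bytes // desc.record_size
```

Keeping the record size in the descriptor is what lets the file manager turn a record number into a byte position without reading the file itself.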
File-Naming Conventions
• Filename components
– Relative filename and extension
• Complete filename (absolute filename)
– Includes all path information
• Relative filename
– Name without path information
– Appears in directory listings, folders
– Provides filename differentiation within directory
– Varies in length
• One to many characters
• Operating system specific
File-Naming Conventions (continued)
• Extensions
– Appended to relative filename
• Typically two to four characters
• Separated by period
• Identifies file type or contents
– Example
• BASIA_TUNE.MPG
– Unknown extension
• Requires user intervention
File-Naming Conventions (continued)
• Operating system specifics
– Windows
• Drive label and directory name, relative name, and
extension
– Network with OpenVMS Alpha
• Node, volume or storage device, directory, subdirectory,
relative name and extension, file version number
– UNIX/Linux
• Forward slash (root), first subdirectory, sub-subdirectory, file’s relative name
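The split between absolute filename, relative filename, and extension can be tried out with Python's standard pathlib module, which parses names without touching a real disk. This is only an illustration: both paths below are hypothetical examples, not files referenced by the slides.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical absolute filenames in UNIX/Linux and Windows form.
unix_name = PurePosixPath("/usr/imfst/flynn/inventory_cost.doc")
win_name = PureWindowsPath(r"C:\IMFST\FLYNN\INVENTORY_COST.DOC")

# Relative filename: the name without any path information.
unix_relative = unix_name.name          # "inventory_cost.doc"
win_relative = win_name.name            # "INVENTORY_COST.DOC"

# Extension: appended to the relative filename, separated by a period.
unix_extension = unix_name.suffix       # ".doc"

# Path information that only the absolute filename carries.
unix_directory = unix_name.parent       # /usr/imfst/flynn
win_drive = win_name.drive              # "C:"

# A bare relative filename is not an absolute path.
is_abs = PurePosixPath(unix_relative).is_absolute()
```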
File Organization
• Arrangement of records within files
• All files composed of records
• Modify command
– Request to access record within a file
Record Format
• Fixed-length records
– Direct access: easy
– Record size critical
– Ideal for data files
• Variable-length records
– Direct access: difficult
– No empty storage space and no character truncation
– File descriptor stores record format
– Used with files accessed sequentially
• Text files, program files
– Used with files using index to access records
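The trade-off between the two record formats can be seen in a few lines. This is a minimal sketch under stated assumptions: the 16-byte record length, the 2-byte length prefix, and the sample names are all chosen for the example and do not come from the slides.

```python
import struct

RL = 16  # assumed fixed record length in bytes


def pack_fixed(text: str) -> bytes:
    """Fixed-length record: truncate long data, pad short data with blanks."""
    data = text.encode("ascii")
    return data[:RL].ljust(RL, b" ")


def pack_variable(text: str) -> bytes:
    """Variable-length record: a 2-byte length prefix, then the data itself."""
    data = text.encode("ascii")
    return struct.pack(">H", len(data)) + data


names = ["Ng", "Dan", "Bartholomew-Fitzgerald"]
fixed = [pack_fixed(n) for n in names]
variable = [pack_variable(n) for n in names]

# Fixed: every record is RL bytes, so the long name loses characters
# and the short names carry unused padding.
# Variable: nothing is truncated or padded, but record sizes differ.
```

The fixed-length form wastes space or truncates data when the record size is chosen badly, which is why the slide calls record size critical; the variable-length form avoids both at the cost of unpredictable record positions.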
Physical File Organization
• Describes:
– Record arrangement
– Medium characteristics
• Magnetic disks file organization
– Sequential, direct, indexed sequential
• File organization scheme selection considerations
– Volatility of data
– Activity of file
– Size of file
– Response time
Physical File Organization (continued)
• Direct record organization (continued)
– Advantages
• Fast record access
• Sequential access if starting at first relative address and
incrementing to next record
• Updated more quickly than sequential files
• No need to preserve record order
• Adding, deleting records is quick
– Disadvantages
• Hashing algorithm collisions
• Similar keys may generate the same address
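A collision is easiest to see with a tiny hashing scheme. The sketch below assumes a simple "key modulo table size" hashing algorithm and a separate overflow area for colliding records; both are illustrative choices, not necessarily the scheme the course text uses.

```python
TABLE_SIZE = 7  # assumed number of relative addresses in the main area


def hash_address(key: int) -> int:
    """Map a record key to a relative address (assumed hashing algorithm)."""
    return key % TABLE_SIZE


main_area = [None] * TABLE_SIZE   # one slot per relative address
overflow = []                     # records whose home address was taken


def insert(key: int, record: str) -> None:
    addr = hash_address(key)
    if main_area[addr] is None:
        main_area[addr] = (key, record)
    else:
        overflow.append((key, record))   # collision: store elsewhere


def lookup(key: int):
    slot = main_area[hash_address(key)]
    if slot is not None and slot[0] == key:
        return slot[1]                   # found with a single direct access
    for k, rec in overflow:              # otherwise search the overflow area
        if k == key:
            return rec
    return None


insert(15, "Ames")
insert(22, "Baker")   # 22 and 15 hash to the same address
insert(10, "Chan")
```

Records that land in the overflow area lose the fast-access advantage, since finding them takes an extra search.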
Physical Storage Allocation
• File manager works with files
– As whole units
– As logical units or records
• Within file
– Records must have same format
– Record length may vary
• Records subdivided into fields
• Application programs manage record structure
• File storage
– Refers to record storage
Contiguous Storage
Noncontiguous Storage
• Files use any available disk storage space
• File records stored in contiguous manner
– If enough empty space
• Remaining file records and additions
– Stored in other disk sections (extents)
– Extents
• Linked together with pointers
• Physical size determined by operating system
• Usually 256 bytes
Noncontiguous Storage (continued)
• File extents linked in two ways
– Storage level
• Each extent points to next one in sequence
• Directory entry
• Filename, storage location of first extent, location of last
extent, total number of extents (not counting first)
– Directory level
• Each extent listed with physical address, size, pointer to
next extent
• Null pointer indicates last one
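Storage-level linking can be sketched as a chain of extents, each holding a pointer to the next. The disk addresses, file name, and record labels below are invented for the example, and a Python dictionary stands in for the disk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Extent:
    address: int                          # physical address of this extent
    records: list                         # records stored in this extent
    next_address: Optional[int] = None    # None plays the role of the null pointer


# Simulated disk: extents scattered over non-adjacent addresses.
disk = {
    20: Extent(20, ["R1", "R2"], next_address=75),
    75: Extent(75, ["R3"], next_address=41),
    41: Extent(41, ["R4", "R5"], next_address=None),
}

# Directory entry: filename, first extent, last extent,
# and number of extents not counting the first.
directory = {"PAYROLL": {"first": 20, "last": 41, "extra_extents": 2}}


def read_file(name: str) -> list:
    """Follow the pointers from the first extent until the null pointer."""
    records = []
    address = directory[name]["first"]
    while address is not None:
        extent = disk[address]
        records.extend(extent.records)
        address = extent.next_address
    return records
```

Reaching record R4 means walking through the two earlier extents first, which is why this scheme offers no direct access.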
Noncontiguous Storage (continued)
• Advantage
– Eliminates external storage fragmentation
– Eliminates need for compaction
• Disadvantage
– No direct access support
• Cannot determine specific record’s exact location
Indexed Storage
Access Methods
• Dictated by the file organization
• Most flexible: indexed sequential files
• Least flexible: sequential files
• Sequential file organization
– Supports only sequential access
• Records: fixed or variable length
• Access next sequential record
– Use address of last byte read
• Current byte address (CBA)
– Updated every time record
accessed
Sequential Access
• Update CBA
• Fixed-length records
– Increment CBA by record length (RL)
• CBA = CBA + RL
• Variable-length records
– Add length of record (RLk) plus number of bytes (N) used to hold record length to CBA
• CBA = CBA + N + RLk
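Both CBA update rules can be checked with a short walk over a byte string. This sketch assumes an 8-byte fixed record length and a 2-byte length field (N = 2) in front of each variable-length record; the sample records are made up.

```python
import struct

RL = 8   # assumed fixed record length in bytes
N = 2    # assumed number of bytes used to hold each record's length


def next_cba_fixed(cba: int) -> int:
    """Fixed-length records: CBA = CBA + RL."""
    return cba + RL


def next_cba_variable(data: bytes, cba: int) -> int:
    """Variable-length records: CBA = CBA + N + RLk."""
    (rl_k,) = struct.unpack_from(">H", data, cba)   # read this record's length
    return cba + N + rl_k


# Build a small file of variable-length records, each prefixed by its length.
data = b"".join(struct.pack(">H", len(r)) + r for r in [b"abc", b"hello", b"z"])

# Walk the file sequentially, noting where each record starts.
starts = []
cba = 0
while cba < len(data):
    starts.append(cba)
    cba = next_cba_variable(data, cba)
```

Note that the variable-length walk has to read each length field in turn, so the position of a record is only known after passing every record before it.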
Direct Access
• Fixed-length records (RN: desired record number)
– CBA = (RN – 1) * RL
• Variable-length records
– Virtually impossible
• Address of desired record cannot be easily computed
– Requires sequential search through records
– Keep table of record numbers and CBAs
• Indexed sequential file
– Accessed sequentially or directly
– Index file searched for pointer to data block
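The contrast between the two record formats under direct access can be sketched as follows. The 32-byte record length, the 2-byte length field, and the sample record lengths are assumptions for the example.

```python
RL = 32  # assumed fixed record length in bytes


def cba_fixed(rn: int) -> int:
    """Fixed-length records: CBA = (RN - 1) * RL, with RN counted from 1."""
    if rn < 1:
        raise ValueError("record numbers start at 1")
    return (rn - 1) * RL


# Variable-length records: no formula exists, so one sequential pass
# builds a table of record numbers and their CBAs for later lookups.
N = 2                      # assumed bytes used to hold each record's length
lengths = [10, 25, 7]      # hypothetical lengths of records 1..3

cba_table = {}
cba = 0
for rn, rl in enumerate(lengths, start=1):
    cba_table[rn] = cba
    cba += N + rl
```

The table is in effect a small index: it trades extra storage and upkeep for the ability to jump straight to a record.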
Levels in a File Management System
• Level implementation
– Structured and modular programming techniques
– Hierarchical
• Highest module passes information to lower module
• Modules further subdivided
– More specific tasks
• Uses information of basic file system
– Logical file system transforms record number to byte
address
Levels in a File Management System
(continued)
• Verification at every level
– Directory level
• File system checks if requested file exists
– Access control verification module
• Determines whether access allowed
– Logical file system
• Checks if requested byte address within file limits
– Device interface module
• Checks if storage device exists
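The four checks can be pictured as a chain in which a request stops at the first level that rejects it. The sketch below is a simplification with hypothetical names and data; a real file manager splits these checks across separate modules.

```python
# Hypothetical system state for the example.
directory = {
    "report.txt": {"size": 100, "device": "disk0", "readers": {"ana"}},
    "old.log": {"size": 50, "device": "tape9", "readers": {"ana"}},
}
devices = {"disk0"}   # storage devices that currently exist


def verify_read(user: str, filename: str, byte_address: int) -> str:
    """Run each level's check in order and report the first failure."""
    entry = directory.get(filename)
    if entry is None:
        return "directory: file not found"
    if user not in entry["readers"]:
        return "access control: access denied"
    if not 0 <= byte_address < entry["size"]:
        return "logical file system: address outside file limits"
    if entry["device"] not in devices:
        return "device interface: storage device not found"
    return "ok"
```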
Access Control Verification Module
• File sharing
– Data files, user-owned program files, system files
– Advantages
• Save space, synchronized updates, resource efficiency
– Disadvantage
• Need to protect file integrity
– Five possible file actions
• READ only, WRITE only, EXECUTE only, DELETE
only, combination
– Four methods: access control matrix, access control lists, capability lists, lockwords
Access Control Matrix
• Advantages
– Easy to implement
– Works well in system with few files, users
Access Control Matrix (continued)
• Disadvantages
– As files and users increase, matrix size increases
• Possibly beyond main memory capacity
– Wasted space: due to null entries
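A small matrix makes both the simplicity and the wasted space visible. In this sketch the user names, file names, and the four-letter rights strings (R, W, E, D for read, write, execute, delete, with "-" for a right not granted) are assumptions for illustration.

```python
users = ["USER1", "USER2", "USER3"]
files = ["FILE1", "FILE2", "FILE3", "FILE4"]

# One row per user, one column per file; "----" is a null entry.
matrix = [
    ["RWED", "R-E-", "----", "----"],   # USER1
    ["----", "R-E-", "RW--", "----"],   # USER2
    ["----", "----", "----", "R---"],   # USER3
]


def allowed(user: str, file: str, action: str) -> bool:
    """Look up the cell for (user, file) and test for the requested action."""
    rights = matrix[users.index(user)][files.index(file)]
    return action in rights


total_cells = len(users) * len(files)
null_entries = sum(cell == "----" for row in matrix for cell in row)
```

Here 7 of the 12 cells grant nothing, and the proportion of empty cells usually grows as users and files are added.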
Access Control Lists
• Modification of access control matrix technique
Access Control Lists (continued)
• Contains user names granted file access
– Users denied access grouped under “WORLD”
• Shorten list by categorizing users
– SYSTEM
• Personnel with unlimited access to all files
– OWNER
• Absolute control over all files created in own account
– GROUP
• All users belonging to appropriate group have access
– WORLD
• All other users in system
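Categorizing users keeps each file's list to four entries regardless of how many users exist. The following sketch assumes a hypothetical file with an owner and a group, plus the same R/W/E/D rights strings used for illustration; none of the names come from the slides.

```python
SYSTEM_USERS = {"root"}                   # personnel with unlimited access
groups = {"payroll": {"ana", "ben"}}      # hypothetical group membership

# One file's access control list, keyed by user category.
file_meta = {
    "owner": "ana",
    "group": "payroll",
    "acl": {"SYSTEM": "RWED", "OWNER": "RWED", "GROUP": "R-E-", "WORLD": "----"},
}


def category(user: str, meta: dict) -> str:
    """Place a user in the first category that applies."""
    if user in SYSTEM_USERS:
        return "SYSTEM"
    if user == meta["owner"]:
        return "OWNER"
    if user in groups.get(meta["group"], set()):
        return "GROUP"
    return "WORLD"


def allowed(user: str, meta: dict, action: str) -> bool:
    return action in meta["acl"][category(user, meta)]
```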
Capability Lists
• Lists every user and files each has access to
• Can control access to devices as well as to files
• Most common
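A capability list turns the access control list around: the lookup starts from the user rather than from the file. In this minimal sketch the user names, object names, and rights strings are hypothetical, and a printer stands in for a device.

```python
# One entry per user, listing every object that user may access.
capabilities = {
    "USER1": {"FILE1": "RWED", "PRINTER1": "-W--"},   # a file and a device
    "USER2": {"FILE2": "R-E-"},
}


def allowed(user: str, obj: str, action: str) -> bool:
    """An object missing from the user's list grants no access at all."""
    return action in capabilities.get(user, {}).get(obj, "")
```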
Lockwords
• Lockword
– Similar to password
– Protects single file
• Advantage
– Requires smallest storage amount for file protection
• Disadvantages
– Guessable, passed on to unauthorized users
– Does not control type of access
• Anyone who knows lockword can read, write, execute,
delete file
Data Compression
Text Compression
• Records with repeated characters
– Repeated characters are replaced with a code
• Repeated terms
– Compressed using symbols to represent most
commonly used words
– University student database common words
• Student, course, grade, and department can each be represented by a single character
• Front-end compression
– Each entry records how many leading characters it shares with the previous entry, followed only by the characters that differ
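Two of these techniques are short enough to sketch directly: replacing repeated characters with a count, and front-end compression of a sorted list. The encodings chosen here (pairs of character and count, pairs of shared-prefix length and remainder) and the sample names are assumptions for the example, not a standard format.

```python
from itertools import groupby


def run_length_encode(text: str) -> list:
    """Replace each run of a repeated character with (character, count)."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def run_length_decode(pairs: list) -> str:
    return "".join(ch * count for ch, count in pairs)


def front_encode(entries: list) -> list:
    """Store each entry as (characters shared with previous entry, remainder)."""
    encoded, previous = [], ""
    for entry in entries:
        shared = 0
        while (shared < min(len(entry), len(previous))
               and entry[shared] == previous[shared]):
            shared += 1
        encoded.append((shared, entry[shared:]))
        previous = entry
    return encoded


def front_decode(encoded: list) -> list:
    entries, previous = [], ""
    for shared, remainder in encoded:
        entry = previous[:shared] + remainder
        entries.append(entry)
        previous = entry
    return entries


padded = run_length_encode("ADAMS" + " " * 10)   # name padded with 10 blanks
names = ["Smith, Betty", "Smith, Donald", "Smith, Gino", "Smithberger, John"]
compressed = front_encode(names)
```

Front-end compression pays off only when neighboring entries share long prefixes, which is why it suits sorted lists such as directories or indexes.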
Other Compression Schemes
• Large files
– Video and music
• ISO MPEG standards
– Photographs
• ISO
– International Organization for Standardization
Summary
• File manager
– Controls every file and processes user commands
– Manages access control procedures
• Maintain file integrity and security
– File organizations
• Sequential, direct, indexed sequential
– Physical storage allocation schemes
• Contiguous, noncontiguous, indexed
– Record types
• Fixed-length versus variable-length records
– Four access control methods