Chapter Eight: File Management

The document discusses file management concepts including: 1) The file manager controls file organization, storage, and access. It tracks file locations, allocates storage, and enforces access controls. 2) Files can be organized sequentially, directly, or indexed sequentially and stored contiguously or non-contiguously. Records within files can be fixed-length or variable-length. 3) The file manager interacts with users via commands to perform functions like opening, reading, writing and deleting files while abstracting physical storage details.

Uploaded by

bombertest1
Copyright
© Attribution Non-Commercial (BY-NC)

Chapter Eight: File Management

• The File Manager
• Interacting With File Manager
• File Organization
• Physical Storage Allocation
• Data Compression
• Access Methods
• Levels in File Management System
• Access Control Verification Module

[Figure: chapter concept map with the labels Fixed Length Records, Variable Length Records, Contiguous Storage, Non-contiguous Storage, Indexed Storage, Sequential or Direct File Access]

Understanding Operating Systems
The File Manager

• File Manager controls every file in system, which is a complex job.
• Efficiency depends on:
– how system’s files are organized (sequential, direct, or indexed sequential).
– how they’re stored (contiguously, noncontiguously, or indexed).
– how each file’s records are structured (fixed-length or variable-length).
– how access to these files is controlled.
Responsibilities of File Manager

1. Track where each file is stored.
2. Determine where and how files will be stored.
– Efficiently use available storage space.
– Provide efficient access to files.
3. Allocate each file when a user has been cleared for access to it, then record its use.
4. Deallocate file when it is returned to storage.
– Communicate its availability to others waiting for it.

Important Definitions

• Field -- group of related bytes that can be identified by user with name, type, and size.
• Record -- group of related fields.
• File (flat file) -- group of related records that contains info used by specific application programs to generate reports.
• Database -- groups of related files that are interconnected at various levels to give flexible access to users.
– Appears to File Manager to be a type of file.

Definitions - 2

• Program files contain instructions.
• Data files contain data.
• Directories -- listings of file names and their attributes.
• Every program and data file accessed by computer system, and every piece of computer software, is treated as a file.
• File Manager treats all files exactly the same way as far as storage is concerned.

Interacting With File Manager

• Users communicate with File Manager via specific commands that may be either embedded in user’s program or submitted interactively by user.

• Embedded commands:
– OPEN & CLOSE pertain to availability of file for
program invoking it.
– READ & WRITE are I/O commands.
– MODIFY – specialized WRITE command for existing
data files that allows for appending/rewriting records.
Interactive Commands

• CREATE & DELETE -- deal with system’s knowledge of file.
• SAVE -- first time used, a file is actually created.
• OPEN NEW -- within a program indicates file must be
created.
• OPEN…FOR OUTPUT -- creates file by making entry for
it in directory & finding space for it in secondary storage.
• RENAME -- allows users to change name of existing file.
• COPY – allows user to make duplicate copies of existing
files.
Commands Are Device-Independent

• Interface commands designed to be as simple as possible to use.
– Lack detailed instructions to run device where file is stored.
– Device independent.
• To access a file, user doesn’t need to know its exact physical location on disk pack or storage medium.
• Each logical command broken down into sequence of low-level signals that
– Trigger step-by-step actions performed by device.
– Supervise progress of operation by testing device’s status.

Typical Volume Configuration

• Each secondary storage unit (removable or non-removable) is considered a volume.
– A volume that contains several files is called a multifile volume.
– Some files are extremely large and span several volumes; these are called multivolume files.
• Generally, each volume in system is given a name.
– File Manager writes name & other descriptive info on easy-to-access place on each unit.
Master File Directory (MFD)

• MFD stored immediately after volume descriptor.
– Lists names & characteristics of every file contained in volume.
– File names refer to program files, data files, and/or system files.
– Lists subdirectories, if supported.
– Remainder of volume is used for file storage.
• Early OS supported only a single directory per volume.
– Created by File Manager.
– Contains names of files, usually organized in alphabetical, spatial, or chronological order.
– Simple to implement and maintain.
– Has some major disadvantages.

Volume Descriptor

Field                       Description
Creation Date               Date when volume was created
Pointer to Directory Area   Indicates first sector where directory is stored
Pointer to File Area        Indicates first sector where file is stored
File System Code            Used to detect volumes with incorrect formats
Volume Name                 User-allocated name

Some Major Disadvantages of Single Directory Per Volume
1. Takes long time to search for an individual file, especially if MFD was organized in an arbitrary order.
2. If user has many small files stored in volume, directory space fills before disk storage space fills. User is told “disk full” when only directory is full.
3. Users can’t create subdirectories to group related files.
4. Multiple users can’t safeguard files from other users browsing file lists, because entire directory is listed on request.
5. Each program in entire directory needs unique name.
– E.g., only one person using directory can name a program PROG1.
About Subdirectories

• Semi-sophisticated File Managers create MFD for each volume with entries for files & subdirectories.
• Subdirectory created when user opens account to access computer.
– MFD entry flagged to indicate subdirectory with unique properties.
• Improvement over single directory scheme.
• Still can’t group files in a logical order to improve accessibility & efficiency of system.

Subdirectories Can Be Implemented As an Upside-down Tree
• Today’s File Managers allow users to create subdirectories
so related files are grouped together.
– Extension of previous two-level directory structure.
• Tree structures allow system to efficiently search
individual directories due to fewer entries in each.
• Path to requested file may lead through several directories.
• When user wants to access specific file, file name is sent to
File Manager. File Manager searches MFD for user's
directory. Then searches user's directory & any
subdirectories for requested file & location.
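
The lookup just described can be sketched in Python as a walk through nested dictionaries. This is only an illustration under stated assumptions, not how any particular File Manager works: the `mfd` layout, the `find_file` function, and the sample names and sector labels are all hypothetical.

```python
# Hypothetical in-memory directory tree: a dict is a directory,
# any other value stands for a file's storage location.
mfd = {
    "user1": {
        "PROG1.BAS": "sector 120",
        "taxes": {"TAXES2001.TXT": "sector 340"},
    },
    "user2": {"PROG1.BAS": "sector 512"},
}


def find_file(mfd, user, name):
    """Search the user's directory, then its subdirectories, for name.

    Returns (path, location), or None if the file is not found.
    """
    user_dir = mfd.get(user)
    if user_dir is None:
        return None
    stack = [([user], user_dir)]  # directories still to be searched
    while stack:
        path, directory = stack.pop()
        for entry, value in directory.items():
            if isinstance(value, dict):
                stack.append((path + [entry], value))  # subdirectory
            elif entry == name:
                return "/".join(path + [entry]), value
    return None
```

Note that both users can own a file called PROG1.BAS, because each search is confined to one user's subtree.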

File Descriptor

Each file entry in every directory contains info describing file:
1. File name -- usually represented in ASCII code.
2. File type -- organization and usage that are dependent on system (e.g., files and directories).
3. File size -- size is kept here for convenience.
4. File location -- identification of first physical block (or all blocks) where file is stored.
5. Date and time of creation.
6. Owner.
7. Protection information -- access restrictions based on who is allowed to access file and what type of access is allowed.
8. Record size -- its fixed size or its maximum size, depending on type of record.

File Names

• Absolute file name (complete file name) – long name that includes all path info.
• Relative file name – short name seen in directory listings.
– Selected by user when file is created.
– E.g., ACCOUNT ADDRESSES, TAXES 2001, or AUTOEXEC.
• Extension – 2-3 character name used to identify type of
file or its contents.
– Separated from relative name by a period.
– E.g., CPP, BAS, BAT, COB, & EXE signal to system to use
specific compiler or program to run these files.
– E.g., TXT, DOC, OUT, MIC, & KEY created by applications or
by users for own identification.

File Naming Conventions

• Can vary in length from one to many characters.
• Can include letters of alphabet & digits.
• Every OS has specific rules that affect length of relative name & types of characters allowed.
– E.g., MS-DOS allows 1-8 alphanumeric character names without spaces.
– More modern OS allow names with dozens of characters including spaces.
• Try to select descriptive relative names that readily identify file contents/purpose of file.
Base and Current Directories Used by File Manager to Locate Files
• File Manager selects base directory for user when
interactive session begins.
– All file operations requested by that user start here.
• Then, user selects subdirectory (current directory or
working directory).
– Thereafter, files presumed to be located in current directory.
• Whenever file accessed, user types in relative name & File
Manager adds proper prefix.
• As long as users refer to files in working directory, can
access them without entering complete name.
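
As a rough illustration of how a prefix might be added to a relative name, here is a small Python sketch. The `/` separator, the `absolute_name` function, and the sample directory names are assumptions made for the example; real systems use their own path syntax.

```python
def absolute_name(base_dir, current_dir, relative_name):
    """Prefix a relative file name with the base and current directories."""
    parts = [base_dir.strip("/"), current_dir.strip("/"), relative_name]
    # Skip empty parts so a user working in the base directory still
    # gets a well-formed name.
    return "/" + "/".join(p for p in parts if p)
```

For example, with base directory `/users/ann` and current directory `taxes`, the relative name `TAXES2001.TXT` becomes `/users/ann/taxes/TAXES2001.TXT`.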

File Organization: Record Format

1. Fixed-length records -- easiest to access directly.
– Most common type & ideal for data files.
– Record size critical (too small: data truncated; too large: space wasted).
2. Variable-length records -- difficult to access directly because hard to calculate exactly where record is located.
– Don’t leave empty storage space & don’t truncate any characters.
– Frequently used in files accessed sequentially (e.g., text files, program files) or files using index to access records.
– File descriptor stores record format, how it’s blocked, & other related info.

Physical File Organization

• Concerned with how records are arranged & characteristics of medium used to store them.
• On magnetic disks, files can be organized as:
1. Sequential
2. Direct
3. Indexed sequential

Characteristics Considered When Selecting File Organization
• Volatility of data -- frequency with which additions & deletions are made.
• Activity of file -- percentage of records processed during a given run.
• Size of file.
• Response time -- amount of time user is willing to wait before requested operation is completed.

Sequential Record Organization

• Easiest to implement because records are stored & retrieved serially, one after other.
• To speed process some optimization features may be built
into system.
– E.g., select a key field from record & then sort records
by that field before storing them.
– Aids search process.
– Complicates maintenance algorithms because original
order must be preserved every time records added or
deleted.
Direct Record Organization (Random Organization)
• Uses direct access files, which can be implemented only on direct access storage devices.
• Gives users flexibility of accessing any record in any order without having to begin search from beginning of file.
• Records are identified by their relative addresses (their addresses relative to beginning of file).
– Logical addresses computed when records are stored & again when records are retrieved.
– Use hashing algorithms.

Advantages of Direct Access Organization

• Fast access to records.
• Can be accessed sequentially by starting at first relative
address & incrementing it by one to get to next record.
• Can be updated more quickly than sequential files because
records quickly rewritten to original addresses after
modifications.
• No need to preserve order of the records, so adding or
deleting them takes very little time.

Collisions Are a Problem With Direct Access Organization
• Several records with unique keys may generate same
logical address (collision).
• Program generates another logical address before
presenting it to File Manager for storage.
• Colliding records stored in overflow area via links.
• File Manager handles physical allocation of space.
• Maximum file size established when created & eventually
file is full or too many records are stored in overflow area.
• Programmer must reorganize & rewrite file.
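
A minimal Python sketch of hashing with an overflow area follows. It makes several assumptions that are not in the slides: the hashing algorithm is a simple remainder, the overflow area is a plain list rather than a chain of links, and the `DirectFile` class and its method names are hypothetical.

```python
class DirectFile:
    """Toy direct-access file: fixed number of slots plus an overflow area."""

    def __init__(self, size):
        self.slots = [None] * size  # main area, indexed by logical address
        self.overflow = []          # colliding records end up here

    def address(self, key):
        # Assumed hashing algorithm: remainder after dividing by file size.
        return key % len(self.slots)

    def store(self, key, record):
        addr = self.address(key)
        if self.slots[addr] is None or self.slots[addr][0] == key:
            self.slots[addr] = (key, record)
            return
        # Collision: a different key already occupies this address.
        for i, (k, _) in enumerate(self.overflow):
            if k == key:
                self.overflow[i] = (key, record)
                return
        self.overflow.append((key, record))

    def fetch(self, key):
        slot = self.slots[self.address(key)]
        if slot is not None and slot[0] == key:
            return slot[1]
        for k, record in self.overflow:  # slower path: scan overflow area
            if k == key:
                return record
        return None
```

With a file of 7 slots, keys 10 and 17 both hash to address 3, so the second record lands in the overflow area. As more records pile up there, fetches degrade toward a sequential scan, which is why the file eventually has to be reorganized.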

Indexed Sequential Record Organization

• Combines best of sequential & direct access.
• Created & maintained through Indexed Sequential Access
Method (ISAM) software package.
• Doesn’t create collisions because it doesn’t use result of
hashing algorithm to generate a record’s address.
– Uses info to generate index file through which records retrieved.
• Divides ordered sequential file into blocks of equal size.
– Size determined by File Manager to take advantage of physical
storage devices & to optimize retrieval strategies.
• Each entry in index file contains highest record key &
physical location of data block where this record, & records
with smaller keys, are stored.
Indexed Sequential - 2

• To access any record in file, system begins by searching index file &
then goes to physical location indicated at that entry.
• Overflow areas are spread throughout file
– Existing records can expand & new records are in close physical &
logical sequence.
– Last-resort overflow area is located apart from main data area but
is used only when the other overflow areas are completely filled.
• When retrieval time becomes too slow, file has to be reorganized.
• Allows both direct access to a few requested records & sequential
access to many records for most dynamic files.
• A variation of indexed sequential files is B-tree.
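
The index lookup can be sketched in Python as below. This is an illustration only: the block contents, the in-memory `index` list, and the `lookup` function are hypothetical, and overflow areas are left out. Each index entry holds the highest key in its block, as described for the index file.

```python
from bisect import bisect_left

# Hypothetical data blocks of an ordered sequential file: (key, record) pairs.
blocks = [
    [(3, "rec3"), (8, "rec8"), (12, "rec12")],
    [(15, "rec15"), (21, "rec21"), (27, "rec27")],
    [(30, "rec30"), (42, "rec42")],
]
# Index file: highest key stored in each block, in block order.
index = [block[-1][0] for block in blocks]


def lookup(key):
    """Find key via the index, then scan only the block it points to."""
    block_no = bisect_left(index, key)  # first block whose highest key >= key
    if block_no == len(index):
        return None                     # key is larger than every stored key
    for k, record in blocks[block_no]:
        if k == key:
            return record
    return None
```

Only one block is scanned per request, while reading the blocks in order still gives sequential access to the whole file.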

Physical Storage Allocation

• File Manager must work with files not just as whole units
but also as logical units or records.
• Records within file must have same format but can vary in
length.
• Records are subdivided into fields.
– Structure usually managed by application programs, not
OS.
• When we talk about file storage, we’re actually referring to record storage.

[Figure: record layouts]
(a) Unblocked, fixed-length records: R1 R2 R3 R4 R5 R6
(b) Blocked, fixed-length records: block number, number of records, then R1 R2 R3
(c) Unblocked, variable-length records: each record preceded by its length (R1 length, R1, R2 length, R2)
(d) Unblocked, variable-length records: records separated by a delimiter (R1 # R2 # R3)
(e) Blocked, variable-length records: block size, block number, number of records, then each record preceded by its length
Contiguous Storage

• Records stored one after other.
• Any record can be found & read once starting address & size are known, so directory is very streamlined.
• Direct access easy – every part of file is stored in same compact area.
• Files can’t be expanded unless there’s empty space available immediately following them.
– Room for expansion must be provided when file is created.
• Fragmentation occurs (slivers of unused storage space).
– Can compact & rearrange files.
– Files can’t be accessed while compaction is taking place.

Noncontiguous Storage

• Allows files to use any storage space available on disk.
• File’s records are stored in a contiguous manner if enough empty space.
• Any remaining records, & all other additions to file, are stored in other sections of disk (extents).
– Linked together with pointers.
– Physical size of each extent is determined by OS (e.g., 256 bytes).

Linking File Extents

1. Linking at storage level – each extent points to next one in sequence.
– Directory entry consists of file name, storage location of first extent, location of last extent, & total number of extents, not counting first.
2. Linking at directory level – each extent listed with its physical address, size, & pointer to next extent.
– A null pointer indicates that it's last one.
• Both eliminate external storage fragmentation & need for compaction.
• Neither supports direct access because no easy way to determine exact location of specific record.
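
To see why linked extents rule out direct access, consider this Python sketch of linking at storage level. The `disk` dictionary, its addresses, and the `read_record` function are hypothetical; the point is only that reaching record 4 means following the chain from the first extent.

```python
# Hypothetical disk: extent address -> (records in extent, next extent address).
disk = {
    40: (["R1", "R2"], 97),
    97: (["R3"], 12),
    12: (["R4", "R5"], None),  # None is the null pointer: last extent
}


def read_record(disk, first_extent, record_number):
    """Follow the chain from the first extent until the record is reached.

    Record numbers start at 1.
    """
    address, seen = first_extent, 0
    while address is not None:
        records, next_address = disk[address]
        if record_number <= seen + len(records):
            return records[record_number - seen - 1]
        seen += len(records)
        address = next_address
    raise IndexError("record number beyond end of file")
```

The cost of a read grows with the number of extents that precede the record, since each one must be visited to learn where the next begins.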

Indexed Storage

• Allows direct record access by bringing pointers linking every extent of that file into index block.
• Every file has its own index block (addresses of each disk sector that make up the file).
– Lists each entry in same order in which sectors are linked.
• When a file is created, pointers in index block set to null.
• As each sector is filled, pointer set to appropriate sector address.
– Address is removed from empty space list & copied into its position in index block.
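
A single-level index block can be sketched in Python as follows. The `IndexedFile` class, its method names, and the sector numbers are assumptions made for the example.

```python
class IndexedFile:
    """Toy file with one index block of sector addresses."""

    def __init__(self, index_size, free_sectors):
        self.index_block = [None] * index_size  # null pointers at creation
        self.free_sectors = free_sectors        # shared empty space list
        self.used = 0

    def add_sector(self):
        """Take an address off the empty space list and record it in the index."""
        if self.used == len(self.index_block):
            raise RuntimeError("index block is full")
        sector = self.free_sectors.pop(0)
        self.index_block[self.used] = sector
        self.used += 1
        return sector

    def sector_of(self, n):
        """Direct access: address of the n-th sector of the file (0-based)."""
        return self.index_block[n]
```

Because every sector address sits in one place, the n-th sector is found with a single index lookup instead of a walk along a chain of pointers.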
Indexed Storage - 2

• Supports both sequential & direct access.
• Doesn’t necessarily improve use of storage space because
each file must have index block.
• For larger files with more entries, several levels of indexes
can be generated.
– To find a desired record, File Manager accesses first
index (highest level), which points to a second index
(lower level), which points to an even lower level index
& eventually to data record.

Data Compression

• Several techniques (3) used to save space in files.
• System must be able to distinguish between compressed & uncompressed data.
• Trade-off: storage space gained, but processing time lost.
1. Records with repeated characters can be abbreviated.
– E.g., fixed-length field with short name & many blank characters; replaced with variable-length field & special code to indicate number of blanks truncated.
ADAMSbbbbbbbbbb -> ADAMSb10
300000000 -> 3#8

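
The repeated-character technique from this slide can be sketched in Python as below. The slide writes a blank as `b`; here the blanks are real spaces and `b` is used only as the code character. The function names are hypothetical, and the sketch assumes the system already knows which fields are compressed.

```python
def compress_trailing(field, char, code):
    """Replace a trailing run of `char` with `code` followed by the run length."""
    stripped = field.rstrip(char)
    run = len(field) - len(stripped)
    return stripped + code + str(run) if run else field


def expand_trailing(text, char, code):
    """Inverse of compress_trailing for text that ends in `code` plus digits."""
    head, sep, count = text.rpartition(code)
    if sep and count.isdigit():
        return head + char * int(count)
    return text
```

Compressing `ADAMS` followed by ten spaces with code `b` gives `ADAMSb10`, and `300000000` with code `#` gives `3#8`, matching the slide's two examples.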
Data Compression: Repeated Terms

2. Repeated terms compressed by using symbols to represent each of most commonly used words in the database.
– E.g., in a university’s student database common words
like student, course, teacher, classroom, grade, &
department could each be represented with single
character.

Data Compression: Front-end Compression
3. Front-end compression used for index compression.
– For example, student database where the students’ names are kept in alphabetical order could be compressed:

Original list          Compressed list
Smith, Betty           Smith, Betty
Smith, Gino            7Gino
Smith, Donald          7Donald
Smithberger, John      5berger, John
Smithbren, Ali         6ren, Ali
Smithco, Rachel        5co, Rachel
Smither, Kevin         5er, Kevin
Smithers, Renny        7s, Renny
Snyder, Katherine      1nyder, Katherine
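
Front-end compression can be sketched in a few lines of Python. The function name is hypothetical, and the sketch assumes names never begin with a digit, so the leading count can be told apart from the text that follows.

```python
def front_end_compress(names):
    """Replace the prefix each name shares with the previous name by its length."""
    compressed, previous = [], None
    for name in names:
        if previous is None:
            compressed.append(name)  # first entry is stored in full
        else:
            shared = 0
            limit = min(len(name), len(previous))
            while shared < limit and name[shared] == previous[shared]:
                shared += 1
            compressed.append(f"{shared}{name[shared:]}")
        previous = name
    return compressed
```

Applied to the original list above, this reproduces the compressed list, e.g. `Smith, Gino` shares the seven characters `Smith, ` with `Smith, Betty` and becomes `7Gino`.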

Access Methods

• Access methods dictated by a file’s organization.
• Most flexibility is allowed with indexed sequential files
and least with sequential.
– File organized in sequential fashion can support only
sequential access to its records, & these records can be
of fixed or variable length.
– File Manager uses the address of last byte read to
access the next sequential record.
– Current byte address (CBA) must be updated every
time a record is accessed.
Sequential Access

• For sequential access of fixed-length records, CBA updated by incrementing it by record length (RL), which is constant:
CBA = CBA + RL
• For sequential access of variable-length records, File Manager adds length of record (RLk) plus number of bytes used to hold record length (N) to CBA:
CBA = CBA + N + RLk
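
Both update rules can be sketched in Python. The byte layout is an assumption made for the example: each variable-length record is preceded by an N-byte big-endian length field, with N = 2 by default. The function names are hypothetical.

```python
def next_cba_fixed(cba, rl):
    """CBA = CBA + RL for fixed-length records."""
    return cba + rl


def record_addresses(data, n=2):
    """Yield the CBA of each record in a file of length-prefixed records.

    Each record is assumed to start with an n-byte big-endian length RLk.
    """
    cba = 0
    while cba < len(data):
        yield cba
        rl_k = int.from_bytes(data[cba:cba + n], "big")
        cba = cba + n + rl_k  # CBA = CBA + N + RLk
```

For a file holding the records `abc`, `hello`, and `x`, the records start at byte addresses 0, 5, and 12: each step skips the 2-byte length field plus the record itself.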

Direct Access & Fixed-Length Records

• If file is organized in direct fashion, it can be accessed easily in direct or sequential order if it has fixed-length records.
• For direct access with fixed-length records, CBA is computed directly from record length & desired record number RN (info provided through READ command) minus one:
CBA = (RN – 1) * RL
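
The formula translates directly into Python. The function names and the sample data are hypothetical; record numbers are assumed to start at 1, as the formula implies.

```python
def cba_direct(rn, rl):
    """CBA = (RN - 1) * RL; record numbers start at 1."""
    if rn < 1:
        raise ValueError("record numbers start at 1")
    return (rn - 1) * rl


def read_record(data, rn, rl):
    """Return record rn from a byte string of fixed-length records."""
    start = cba_direct(rn, rl)
    return data[start:start + rl]
```

With 25-byte records, record 1 starts at byte 0 and record 4 at byte 75, with no need to read the records in between.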

Direct Access & Variable-Length Records

• Virtually impossible to access a record directly because address of desired record can’t be easily computed.
• To access a record, File Manager must do sequential search through records.
– If File Manager saves address of last record accessed, can do half-sequential read through file. When next request arrives it could search forward from CBA.
– Or File Manager can keep table of record numbers & their CBAs. Search table for exact storage location of desired record.
• To avoid this problem, many systems force users to have files organized for fixed-length records if they want direct access to records.

Access of Records in Indexed Sequential File
• Accessed either sequentially or directly.
• Either CBA computation applies but with one extra step.
– Index file must be searched for pointer to block where data is stored.
– Because index file is smaller, it can be kept in main memory & searched quickly to locate block where desired record is stored.
– Block retrieved from secondary storage & beginning byte address of record calculated.
• In systems with several levels of indexing, index at each level must be searched before computing CBA.
– Entry point to this type of data file is usually through index file.

Levels in a File Management System

• Efficient management of files can’t be separated from efficient management of devices that house them.
• A wide range of functions must be organized for I/O system to perform efficiently.
• Each level implemented by using structured & modular programming techniques, which also set up a hierarchy.

[Figure: levels in a file management system, top to bottom: Basic File System, Access Control Module, Logical File System, Physical File System, Device Interface Module, Device]
Basic File System

• Highest level module that passes info to logical file system, which notifies physical file system, which works with Device Manager.
• Activates access control verification module to verify that this user is permitted to perform this operation with this file.

Access Control Verification Module

• Any file can be shared.
• Saves space & allows for synchronization of data updates.
• Improves efficiency of system's resources, because if files are shared in main memory, I/O operations are reduced.
• However, integrity of each file must be safeguarded.
– Control over who is allowed to access file and what type of access is permitted.
– READ only, WRITE only, EXECUTE only, DELETE only, or some combination.

File Access Control Methods

• Each file management system has its own file access control method.
• Most common methods:
1. Access control matrix
2. Access control lists
3. Capability lists
4. Lockword control

Access Control Matrix

• Intuitively appealing & easy to implement.
• Works well only for systems with few files & few users.
• In matrix each column identifies a user & each row identifies a file.
• Intersection of row & column has access rights for that user to that file.

         User 1   User 2   User 3   User 4   User 5
File 1   RWED     R-E-     ----     RWE-     --E-
File 2   ----     R-E-     R-E-     --E-     ----
File 3   ----     RWED     ----     --E-     ----
File 4   R-E-     ----     ----     ----     RWED
File 5   ----     ----     ----     ----     RWED

R = Read Access
W = Write Access
E = Execute Access
D = Delete Access
- = Access Not Allowed
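
A matrix lookup can be sketched in Python using the table from this slide. The `USERS` and `MATRIX` names and the `is_allowed` function are hypothetical; only the access rights themselves come from the slide.

```python
USERS = ["User 1", "User 2", "User 3", "User 4", "User 5"]

# One row per file, one column per user, as in the slide's matrix.
MATRIX = {
    "File 1": ["RWED", "R-E-", "----", "RWE-", "--E-"],
    "File 2": ["----", "R-E-", "R-E-", "--E-", "----"],
    "File 3": ["----", "RWED", "----", "--E-", "----"],
    "File 4": ["R-E-", "----", "----", "----", "RWED"],
    "File 5": ["----", "----", "----", "----", "RWED"],
}


def is_allowed(file, user, right):
    """Check the intersection of the file's row and the user's column."""
    if right not in "RWED" or len(right) != 1:
        raise ValueError("right must be one of R, W, E, D")
    return right in MATRIX[file][USERS.index(user)]
```

Most cells are `----`, which illustrates why the matrix wastes space as the number of files and users grows.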
Access Control Lists

• Modification of access control matrix technique.
• Each file is entered in list & contains names of users allowed to access
it & type of access permitted.
• To shorten list, only those who may use file are named; those denied
any access are grouped under global heading such as WORLD.
• Or shorten by putting every user into a category:
– SYSTEM – system personnel with unlimited access to all files.
– OWNER – absolute control over all files created in own account.
– GROUP – all users belonging to appropriate group have access.
– WORLD – all other users in system; default access types given by
File Manager.

Access Control List Example

File     Access
File 1   USER1 (RWED), USER2 (R-E-), USER4 (RWE-), USER5 (--E-), WORLD (----)
File 2   USER2 (R-E-), USER3 (R-E-), USER4 (--E-), WORLD (----)
File 3   USER2 (RWED), USER4 (--E-), WORLD (----)
File 4   USER1 (R-E-), USER5 (RWED), WORLD (----)
File 5   USER5 (RWED), WORLD (----)
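
The same lists can be written as a Python dictionary, with WORLD acting as the default for any user not named. The `ACL` name and the `rights` function are hypothetical; the entries are taken from the example above.

```python
ACL = {
    "File 1": {"USER1": "RWED", "USER2": "R-E-", "USER4": "RWE-",
               "USER5": "--E-", "WORLD": "----"},
    "File 2": {"USER2": "R-E-", "USER3": "R-E-", "USER4": "--E-",
               "WORLD": "----"},
    "File 3": {"USER2": "RWED", "USER4": "--E-", "WORLD": "----"},
    "File 4": {"USER1": "R-E-", "USER5": "RWED", "WORLD": "----"},
    "File 5": {"USER5": "RWED", "WORLD": "----"},
}


def rights(file, user):
    """Return the user's rights, falling back to the WORLD entry."""
    entry = ACL[file]
    return entry.get(user, entry["WORLD"])
```

Users with no access never appear by name, which is what keeps each list short compared with a full matrix row.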

Capability Lists

• Lists every user and files to which each has access.
• Requires less storage space than an access control matrix.
• Easier to maintain than an access control list when users are added or deleted from system.

User     Access
User1    File1 (RWED), File4 (R-E-)
User2    File1 (R-E-), File2 (R-E-), File3 (RWED)
User3    File2 (R-E-)
User4    File1 (RWE-), File2 (--E-), File3 (--E-)
User5    File1 (--E-), File4 (RWED), File5 (RWED)
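
A capability list is the access control matrix read by user instead of by file. The Python sketch below derives one from the other; the `USERS` and `MATRIX` names and the `capability_lists` function are hypothetical, while the rights are those of the access control matrix shown earlier in the chapter.

```python
USERS = ["User1", "User2", "User3", "User4", "User5"]

# Access control matrix: one row per file, one column per user.
MATRIX = {
    "File1": ["RWED", "R-E-", "----", "RWE-", "--E-"],
    "File2": ["----", "R-E-", "R-E-", "--E-", "----"],
    "File3": ["----", "RWED", "----", "--E-", "----"],
    "File4": ["R-E-", "----", "----", "----", "RWED"],
    "File5": ["----", "----", "----", "----", "RWED"],
}


def capability_lists(users, matrix):
    """Group rights by user, dropping files the user cannot access at all."""
    caps = {user: {} for user in users}
    for file, row in matrix.items():
        for user, rights in zip(users, row):
            if rights != "----":
                caps[user][file] = rights
    return caps
```

Removing a user then means deleting one entry from the result, rather than editing the list attached to every file.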

Lockword Control

• Lockword is similar to a password but protects a single file.
– When file is created, owner protects it via lockword.
– Stored in directory but isn’t revealed with directory listing.
– User must provide correct lockword to access protected file.
• Requires smallest amount of storage for file protection.
• Can be guessed by hackers or passed on to unauthorized users.
• Generally doesn’t control type of access to file.
– Anyone who knows lockword can read, write, execute, or delete file.

Terminology

• access control list
• access control matrix
• capability list
• complete file name
• current byte address (CBA)
• current directory
• data compression
• data file
• database
• device independent
• direct access files
• direct record organization
• directory
• extension
• extents
• file
• file descriptor
• fixed-length record
• hashing algorithm
• indexed sequential record organization
• key field
• lockword
• logical address

Terminology - 2

• master file directory (MFD)
• relative address
• relative file name
• sequential record organization
• subdirectory
• variable-length record
• volume
• working directory
