File Management

File management-Operating systems

Uploaded by

Maqbul Hanif
Copyright
© © All Rights Reserved

FILE MANAGEMENT

File system – the part of the o/s concerned with managing secondary storage space, particularly disk storage. It consists of
two distinct parts:
• A collection of files, each storing related data
• A directory structure, which organizes and provides information about all the files in the system.
A file – a named collection of related information that is recorded on secondary storage.

File Naming
A file is named for the convenience of its human users, and a name is usually a string of characters, e.g.
“good.c”. Once a file is named it becomes independent of the process, the user and even the system that
created it, i.e. another user may edit the same file and save it under a different name.

File Types
An o/s should recognize and support different file types so that it can operate on each file in reasonable
ways.
A file type can be implemented by including it as part of the file name. The name is split into 2 parts – a
name and an extension, usually separated by a period character. The extension indicates the type of file and
the type of operations that can be performed on that file.

File Type       Usual extensions

Executable      .exe, .com
Source code     .c, .p, .pas, .bas, .vbp, .prg, .java
Text            .txt, .rtf, .doc
Graphics        .bmp, .jpg, .mtf
Spreadsheet     .xls
Archive         .zip, .arc, .rar (compressed files)
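
The split of a file name into a name and an extension can be sketched in Python as follows. This is only an illustration: the mapping is a small subset of the extensions listed above, and `classify` is a hypothetical helper name, not part of any o/s.

```python
import os.path

# Illustrative subset of the extension table (an assumption, not a complete list).
FILE_TYPES = {
    ".exe": "Executable",
    ".c": "Source code",
    ".txt": "Text",
    ".bmp": "Graphics",
    ".xls": "Spreadsheet",
    ".zip": "Archive",
}

def classify(filename):
    """Return (name, extension, file type) for a file name such as 'good.c'."""
    name, ext = os.path.splitext(filename)   # split at the last period
    return name, ext, FILE_TYPES.get(ext.lower(), "Unknown")

print(classify("good.c"))
```

A name with no extension, such as "notes", would be reported as "Unknown" in this sketch.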

File Attributes
Apart from the name, other attributes of a file include:
• Size – amount of information stored in the file
• Location – a pointer to a device and to the location of the file on that device
• Protection – access control information for the file (who can read, write, execute and so on)
• Time, date and user identification – this information is kept for creation, last modification and
last use. These data can be useful for protection, security and usage monitoring.
• Volatility – frequency with which additions and deletions are made to a file
• Activity – the percentage of a file’s records accessed during a given period of time.
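
Several of these attributes can be inspected from a program. The following is a minimal sketch using Python's `os.stat`; the temporary file and the variable names are assumptions made for the example, and volatility and activity are not reported by this call.

```python
import os
import tempfile
import time

# Create a small temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    path = f.name

info = os.stat(path)
size = info.st_size                     # Size: number of bytes stored in the file
mode = oct(info.st_mode & 0o777)        # Protection: permission bits
modified = time.ctime(info.st_mtime)    # Time and date of last modification
accessed = time.ctime(info.st_atime)    # Time and date of last use
owner = info.st_uid                     # User identification (0 on some systems)

print(size, mode, modified, accessed, owner)
os.remove(path)
```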

File Operations
Files can be manipulated by operations such as:
• Open – prepare a file to be referenced
• Close – prevent further reference to a file until it is reopened
• Create – build a new file
• Destroy – remove a file
• Copy – create another version of a file with a new name
• Rename – change the name of a file
• List – print or display the contents of a file
• Move – change the location of a file
Individual data items within the file may be manipulated by operations like:
• Read – input a data item to a process from a file
• Write – output a data item from a process to a file
• Update – modify an existing data item in a file
• Insert – add a new data item to a file
• Delete – remove a data item from a file
• Truncate – delete data items from a file while the file retains all its other attributes
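
As a rough illustration, most of these operations map onto calls in Python's standard library. The sketch below works in a temporary directory; the file names are made up for the example.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "good.txt")

with open(original, "w") as f:          # Create and Open
    f.write("first record\n")           # Write
# leaving the 'with' block performs Close

with open(original, "a") as f:          # Insert a new data item at the end
    f.write("second record\n")

with open(original) as f:               # Read
    records = f.readlines()

copy = os.path.join(workdir, "copy.txt")
shutil.copy(original, copy)             # Copy
renamed = os.path.join(workdir, "renamed.txt")
os.rename(copy, renamed)                # Rename

os.truncate(original, 0)                # Truncate: data removed, the file itself remains
size_after_truncate = os.path.getsize(original)
names = sorted(os.listdir(workdir))     # files now present in the working directory

os.remove(original)                     # Destroy
shutil.rmtree(workdir)                  # clean up the rest of the example
```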

File Structure
Refers to the internal organization of the file. The file type may indicate the structure. Certain files must conform to
a required structure that is understood by the o/s. Some o/s have file systems that support multiple
structures, while others impose (and support) a minimal number of file structures, e.g. MS-DOS and UNIX.
UNIX considers each file to be a sequence of 8-bit bytes. The Macintosh o/s also supports a minimal number of file
structures and expects executables to contain 2 parts – a resource fork and a data fork. The resource fork
contains information of importance to the user, e.g. the labels of any buttons displayed by the program.

File Organization
Refers to the manner in which the records of a file are arranged on secondary storage.
The most popular schemes are:-
a) Sequential
Records are placed in physical order. The next record is the one that physically follows the previous record. It
is used for records on magnetic tape.
b) Direct
Records are directly (randomly) accessed by their physical addresses on a direct access storage device
(DASD). The application user places the records on the DASD in any order appropriate for a particular
application.
c) Indexed Sequential
Records are arranged in logical sequence according to a key contained in each record. The system
maintains an index containing the physical addresses of certain principal records. Indexed sequential
records may be accessed sequentially in key order, or they may be accessed directly by a search through
the system-created index. It is usually used on disk.

d) Partitioned
It is a file of sequential sub files. Each sequential sub file is called a member. The starting address of each
member is stored in the file directory. Partitioned files are often used to store program libraries.
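
The indexed sequential scheme can be sketched in Python as below. This is a simplified model under stated assumptions: list positions stand in for physical addresses, every third record is treated as a "principal" record, and the names `lookup` and `scan` are hypothetical.

```python
import bisect

# Records held in key order; list positions stand in for physical addresses.
records = [(10, "Ann"), (20, "Bob"), (30, "Cy"), (40, "Dee"), (50, "Eve"), (60, "Fay")]

# Sparse index: the key and position of every third (principal) record.
STEP = 3
index_pos = list(range(0, len(records), STEP))
index_keys = [records[i][0] for i in index_pos]

def lookup(key):
    """Direct access: search the index, then scan forward within one group."""
    slot = bisect.bisect_right(index_keys, key) - 1
    if slot < 0:
        return None                      # key is smaller than every indexed key
    start = index_pos[slot]
    for k, value in records[start:start + STEP]:
        if k == key:
            return value
    return None

def scan():
    """Sequential access: visit every record in key order."""
    return [value for _, value in records]
```

Here `lookup(50)` consults the index, jumps to the group starting at key 40 and finds the record after examining two entries, while `scan()` simply walks the records in key order.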

FILE ALLOCATION METHODS


Deals with how to allocate disk space to different files so that the space is utilized effectively and files
can be accessed quickly. Methods used include:
1. Contiguous Allocation
Files are assigned to contiguous areas of secondary storage. A user specifies in advance the size of area
needed to hold a file to be created. If the desired amount of contiguous space is not available the file
cannot be created.
Advantages
i) Speeds up access, since successive logical records are normally physically adjacent to one another
ii) File directories are relatively straightforward to implement: for each file the directory merely retains the address
of the start of the file and its length.
Disadvantages
i) It is difficult to find space for a new file, especially if the free space on the disk is fragmented into a number of separate
holes (external fragmentation).
To remedy this, a user can run a repackaging routine (disk defragmenter).
ii) Another difficulty is determining how much space is needed for a file, since if too little space is
allocated the file cannot be extended, i.e. it cannot grow.
Some o/s use a modified contiguous allocation scheme where a contiguous chunk of space is
allocated initially and then, when that amount is not large enough, another chunk of contiguous
space, called an extent, is added.
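
The difficulty of finding a large enough hole can be seen in a small simulation. The sketch below assumes a first-fit search over a 10-block disk; the search strategy, the disk size and the function names are illustrative choices, not something the notes prescribe.

```python
def first_fit(disk, size):
    """Return the start of the first run of `size` free blocks, or None.

    `disk` is a list of booleans: True = allocated, False = free.
    """
    run_start, run_len = 0, 0
    for i, used in enumerate(disk):
        if used:
            run_len = 0
        else:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == size:
                return run_start
    return None

def allocate(disk, directory, name, size):
    """Create a file of `size` blocks; fail if no contiguous hole is big enough."""
    start = first_fit(disk, size)
    if start is None:
        return False
    for i in range(start, start + size):
        disk[i] = True
    directory[name] = (start, size)      # directory entry: start address and length
    return True

disk = [False] * 10
directory = {}
allocate(disk, directory, "a", 3)        # occupies blocks 0-2
allocate(disk, directory, "b", 4)        # occupies blocks 3-6
for i in range(0, 3):                    # delete "a": leaves a 3-block hole at the front
    disk[i] = False
del directory["a"]
ok = allocate(disk, directory, "c", 5)   # 6 blocks are free in total, but not 5 in a row
```

In this run `ok` is False: the request fails even though enough free blocks exist, because they are split into two holes of 3 blocks each.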
2. Non Contiguous Allocation
It is used because files tend to grow or shrink over time and because users rarely know in advance how
large a file will be. Techniques used include:
a) Linked Allocation
It solves the problems of contiguous allocation described above. Each file is a linked list of disk blocks; the disk blocks
may be scattered anywhere on the disk. The directory contains a pointer to the first and last disk block of
the file. Each block contains a pointer to the next block.
Advantage
• Eliminates external fragmentation
Disadvantages
• It is effective only for sequential access files, since reaching a given block requires following the pointers
from the first block through all the blocks before it.
• Space is used up by the pointers, e.g. 4 bytes in every block. The usual solution is to collect blocks into
multiples called clusters and to allocate clusters rather than blocks.
• Not very reliable, since loss of or damage to a pointer would make the rest of the file unreachable
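
A small model makes the sequential-access limitation concrete. In this sketch a dictionary stands in for the disk and the block numbers are arbitrary; `read_block` is a hypothetical helper that also counts how many blocks had to be visited.

```python
# Each disk block holds (data, pointer to next block); None marks the end of the file.
disk = {
    9: ("A", 16),
    16: ("B", 1),
    1: ("C", 10),
    10: ("D", None),
}
# Directory entry: pointers to the first and last blocks of the file.
directory = {"demo": {"first": 9, "last": 10}}

def read_block(name, n):
    """Return the data in the n-th block (0-based) and the number of blocks visited."""
    block = directory[name]["first"]
    visited = 1
    for _ in range(n):
        block = disk[block][1]           # follow the pointer stored in the block
        visited += 1
    return disk[block][0], visited
```

Reading the fourth block with `read_block("demo", 3)` visits all four blocks, which is why direct access is inefficient under linked allocation.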
b) Indexed Allocation
It tries to support efficient direct access which is not possible in linked allocation. Pointers to the blocks
are brought together into one location (index block). Each file has its own index block, which is an array
of disk block addresses.
Advantage
• Supports direct access without suffering from external fragmentation
Disadvantages
• Suffers from wasted space due to the pointer overhead of the index block
• The biggest problem is determining how large the index block should be
The mechanisms to deal with this include:-
i) Linked scheme – an index block is normally one disk block; to allow large files we may
link together several index blocks.
ii) Multilevel index – use a separate index block to point to the index blocks, which point
to the file’s blocks themselves
iii) Combined scheme – keeps the first, say, 15 pointers of the index block in the file’s
index block. The first 12 of these pointers point to direct blocks, i.e. they contain the
addresses of blocks that contain data of the file. The next 3 pointers point to indirect
blocks, which contain the addresses of further blocks rather than data.
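
The reach of the combined scheme can be estimated with a short calculation. The sketch below assumes that the 3 indirect pointers are one single, one double and one triple indirect pointer (as in UNIX), and uses a 4096-byte block with 4-byte pointers; these figures are assumptions for illustration only.

```python
def max_file_size(block_size, pointer_size, direct=12):
    """Largest file addressable with `direct` direct pointers plus one
    single, one double and one triple indirect pointer (assumed layout)."""
    per_block = block_size // pointer_size        # pointers held by one index block
    blocks = direct + per_block + per_block**2 + per_block**3
    return blocks * block_size

size = max_file_size(4096, 4)
print(size)   # bytes addressable under the assumptions above
```

With these figures each index block holds 1024 pointers, so the triple indirect pointer alone accounts for 1024^3 data blocks, and the total is a little over 4 TB.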

FILE MANAGEMENT TECHNIQUES


File Implementation
It is concerned with issues such as file storage and access on the most common secondary storage
medium, the hard disk. It explores ways to allocate disk space, to recover freed space, to track the
locations of data and to interface other parts of the o/s to the secondary storage.
Directory implementation
The selection of directory allocation and directory management algorithms has a large effect on the
efficiency, performance and reliability of the file system.
Algorithms used:
a) Linear list – the simplest method of implementing a directory. It uses a linear list of file names with
pointers to the data blocks. It requires a linear search to find a particular entry.
b) Hash table – a linear list stores the directory entries, but a hash data structure is also used. The hash
table takes a value computed from the file name and returns a pointer to the file name in the linear
list.
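
The difference between the two directory lookups can be sketched as follows. The entries and block numbers are invented for the example, and a Python dict stands in for the hash structure.

```python
# Linear list of directory entries: (file name, first data block).
entries = [("good.c", 120), ("notes.txt", 45), ("a.out", 300)]

def linear_lookup(name):
    """Linear search: compare against each entry in turn."""
    for fname, block in entries:
        if fname == name:
            return block
    return None

# Hash table from file name to the entry's position in the linear list.
table = {fname: pos for pos, (fname, _) in enumerate(entries)}

def hashed_lookup(name):
    """Hash the name to find the entry's position without scanning the list."""
    pos = table.get(name)
    return None if pos is None else entries[pos][1]
```

Both functions return the same answers; the hashed version avoids the scan, at the cost of keeping the table consistent whenever entries are added or removed.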
Free Space Management
To keep track of free disk space the system maintains a free space list which records all disk blocks that
are free (i.e. those not allocated to some file or directory). To create a file we search the free space list for
the required amount of space and allocate that space to the new file. This space is then removed from the
free space list. Techniques used to manage disk space include:-
a) Bit vector – the free space list is implemented as a bit map or bit vector. Each block is represented by
1 bit. If the block is free, the bit is 1; if the block is allocated, the bit is 0.
b) Linked list – links together all the free disk blocks, keeping a pointer to the first free block in a
special location on the disk and caching it in memory. This first block contains a pointer to the
next free disk block, and so on.
c) Grouping – a modification of the free list approach that stores the addresses of n free blocks in the
first free block. The first n-1 of these blocks are actually free; the last one contains the addresses of
another n free blocks, and so on. The importance of
this implementation is that the addresses of a large number of free blocks can be found quickly,
unlike in the standard linked list approach.
d) Counting – takes advantage of the fact that generally several contiguous blocks may be
allocated or freed simultaneously, particularly when space is allocated with the contiguous allocation
algorithm or through clustering. Thus, rather than keeping a list of n free disk addresses, we can
keep the address of the first free block and the number n of free contiguous blocks that follow the
first block. Although each entry requires more space than would a simple disk address, the overall
list will be shorter as long as the count is generally greater than 1.
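
The bit vector and counting representations describe the same free space in different forms. The sketch below uses a made-up 11-block disk and hypothetical helper names to show a search of the bit vector and its conversion to counting form.

```python
# Bit vector: 1 = free, 0 = allocated.
bits = [0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1]

def first_free(bits):
    """Return the number of the first free block, or None if the disk is full."""
    for block, bit in enumerate(bits):
        if bit == 1:
            return block
    return None

def to_counting(bits):
    """Convert the bit vector to counting form: (first free block, run length)."""
    runs = []
    for block, bit in enumerate(bits):
        if bit == 1:
            if runs and runs[-1][0] + runs[-1][1] == block:
                runs[-1] = (runs[-1][0], runs[-1][1] + 1)   # extend the current run
            else:
                runs.append((block, 1))                     # start a new run
    return runs
```

For this disk the six free blocks collapse into three counting entries, which illustrates why the counting list is shorter when free blocks tend to be contiguous.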

Efficiency and Performance


This is essential when it comes to file and directory implementation, since disks tend to be a major
bottleneck in system performance, i.e. they are the slowest main computer component. To improve
performance, disk controllers include enough local memory to form an on-board cache that
may be sufficiently large to store an entire track at a time.

File Sharing
In a multi-user system there is almost always a requirement for allowing files to be shared among a number of
users. Two issues arise:-
i) Access rights – the system should provide a number of options so that the way in which a particular file is
accessed can be controlled. Access rights assigned to a particular file include: read, execute,
append, update, delete.
ii) Simultaneous access – a discipline is required when access to append to or update a file is granted to
more than one user. One approach is to allow a user to lock the entire file when it is to be updated.
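
Both issues can be modelled in a few lines. The following is a toy sketch, not a real o/s mechanism: the `Right` flags mirror the access rights listed above, while the `SharedFile` class, its methods and the user names are assumptions made for the example.

```python
from enum import Flag, auto

class Right(Flag):
    READ = auto()
    EXECUTE = auto()
    APPEND = auto()
    UPDATE = auto()
    DELETE = auto()

class SharedFile:
    def __init__(self, rights):
        self.rights = rights             # user name -> Right flags
        self.locked_by = None            # user currently holding the whole-file lock

    def allowed(self, user, right):
        """Check whether `user` has been granted `right` on this file."""
        return right in self.rights.get(user, Right(0))

    def lock(self, user):
        """Grant the lock only to a user who may update or append, and only if it is free."""
        can_write = self.allowed(user, Right.UPDATE) or self.allowed(user, Right.APPEND)
        if not can_write or self.locked_by is not None:
            return False
        self.locked_by = user
        return True

    def unlock(self, user):
        """Release the lock if `user` is the one holding it."""
        if self.locked_by == user:
            self.locked_by = None
```

A second writer who calls `lock` while the file is held is refused and must try again after the holder calls `unlock`; a user with only the read right is never granted the lock.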
