
File MGMT

The document discusses file system management in an operating system. It describes how the OS maps files onto physical storage devices, manages file creation and deletion, and provides access to files. It also discusses file attributes like name, size, location; different file types like regular files and directories; and common file operations like reading, writing, opening and closing files.

Uploaded by

Roshan Nandan

FILE SYSTEM

1/29/2017 prepared by : Er. Roshan Nandan 1


File-System Management:
• Information must be stored for the long term so that it can be accessed at any time.
• It is also essential to make data sharable among various processes.
• This information can be huge in size and therefore, must be accommodated on the appropriate
storage devices.
• Computers can store info on several different types of physical media (e.g. magnetic disk, optical
disk…)
• Each medium is controlled by a device (e.g. disk drive)
• The OS maps files onto physical media, and accesses these files via the storage devices
• File = a collection of related information
• The OS implements the abstract concept of a file by managing mass storage media and the
devices that control them
• The OS is responsible for these file management activities:
• Creating and deleting files
• Creating and deleting directories
• Supporting primitives for manipulating files & directories
• Mapping files onto secondary storage
• Backing up files on stable (non-volatile) storage media
• File Concept :
• A file is a sequence of logical records, i.e. a sequence of bits and bytes.
• File is a named collection of related info on secondary storage
• Data can't be written to secondary storage unless it's in a file
• A file has a certain defined structure according to its type :
• Text file: sequence of characters organized into lines
• Source file: sequence of subroutines & functions
• Object file: sequence of bytes organized into blocks
• Executable file: series of code sections
• Contiguous logical address space
• Types:
• Data
• numeric
• character
• binary
• Program
• File naming:
• Most visible part of OS
• Mechanism to store and retrieve information from the disk
• Represents programs(both source and object) and data
• Data files may be numeric, alphanumeric, or binary
• May be free form or formatted rigidly
• Accessed by name
• Created by process and continues to exist after the process has terminated
• Naming conventions:
• Names are limited to a maximum number of characters (letters, digits, special characters)
• Case sensitivity: some systems (e.g. UNIX) distinguish upper and lower case, others (e.g. MS-DOS) do not
• Many operating systems support two-part file names, with the two parts
separated by a period, as in Student.doc
• File naming = File name + File extension
• In this example, Student is the file name and doc is the file extension
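As a small illustration of the two-part naming convention, the sketch below splits a name at its last period using Python's standard `os.path.splitext`. The helper name `split_file_name` is made up for this example.

```python
import os.path

def split_file_name(full_name):
    """Split a two-part file name into (name, extension)."""
    name, ext = os.path.splitext(full_name)
    # splitext keeps the leading period; drop it to match the wording above.
    return name, ext.lstrip(".")

name, extension = split_file_name("Student.doc")
print(name, extension)
```

A name without a period simply yields an empty extension.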
• File Attributes :
• A file's attributes vary from one operating system to another but
typically consist of these:
• Name: The only info kept in human-readable form
• Identifier: Unique tag
• Type: Info needed for those systems that support different types
• Location: Pointer to a device and to the location of the file
• Size: Current size (in bytes, words, or blocks)
• Protection: Access-control info
• Time, date, & user id: Useful for protection & usage monitoring
• Information about files is kept in the directory structure, which is
maintained on the disk
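Most of these attributes can be inspected from a program. The following is a minimal sketch using Python's `os.stat`; the dictionary keys and the helper name `describe` are chosen for this example only, and the exact attributes available depend on the operating system.

```python
import os
import tempfile
import time

def describe(path):
    """Return a few of the attributes the OS keeps for a file."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "identifier": info.st_ino,                 # unique tag within the file system
        "size": info.st_size,                      # current size in bytes
        "protection": oct(info.st_mode & 0o777),   # access-control bits
        "modified": time.ctime(info.st_mtime),     # time and date
        "owner_id": info.st_uid,                   # user id
    }

# Create a small scratch file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    demo_path = f.name

attrs = describe(demo_path)
print(attrs)
os.remove(demo_path)
```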
• File Structure:
• Byte sequence:
• UNIX and MS-DOS use byte sequence
• Meaning of the bytes is imposed by user-level programs
• Provides maximum flexibility but minimal support
• An advantage to users who want to define their own semantics on files
• Record sequence:
• A file is a sequence of fixed-length records
• Dates from the days of punched-card readers and line printers
• Used in CP/M with a 128-character record
• Tree :
• Useful for searching
• Used in some mainframes.
• In this organization, a file consists of a tree of records, not necessarily all the same
length.
• Each containing a key field in a fixed position in the record.
• The tree is sorted on the key field, to allow rapid searching for a particular key.

• File Types:
• Many operating systems support several types of files
• Regular files, Directories, Character special files, Block special files.
• Regular files:
• Most contain ASCII characters, binary data, or executable program
binaries, and serve as program input or output
• The kernel gives no support for structuring the contents of these files
• Both sequential and random access are supported
• There are following types of regular files:
• Text files(ASCII)
• Lines of text
• Lines may be terminated by a carriage return, a line feed, or both, depending on the system
• On some systems the file itself ends with an end-of-file character
• Useful for inter-process communication via pipes in UNIX



• Binary files:
• Not readily readable
• Has internal structure depending upon type of file(executable or archive)
• Executable file(a.out format)
• Header: Magic number, Text size, bss size, symbol table size, Entry point, Flags
• Text
• Data
• Relocation bits
• Symbol table
• BSS or block started by symbol
• Uninitialized data for which kernel should allocate space
• The name comes from an obsolete IBM assembler, in which BSS was a pseudo-op that set aside
an area of memory; the area is filled with zeros when the program is loaded
• Library archive: compiled but not linked modules
• Header: Module name, Date, Owner, Protection, Size
• Object module

• File Operations :
• Creating a file:
• First, space in the file system must be found for the file
• Then, an entry for the file must be made in the directory
• Writing a file:
• Make a system call specifying both the name of the file and the info to be written to the
file
• The system must keep a write pointer to the location in the file where the next write is
to take place
• Reading a file:
• Use a system call that specifies the name of the file and where in memory the next block
of the file should be put
• Once the read has taken place, the read pointer is updated
• Repositioning within a file:
• The directory is searched for the appropriate entry and the current-file-position is set to
a given value
• Deleting a file:
• Search the directory for the named file and release all file space and erase the directory
entry
• Append:
• This call is a restricted form of WRITE.
• It can only add data to the end of the file.
• Systems that provide a minimal set of system calls do not generally have
APPEND
• Many systems provide multiple ways of doing the same thing, and these
systems sometimes have APPEND.
• Seek:
• For random access files, a method is needed to specify from where to take
the data.
• One common approach is a system call, SEEK.
• It repositions the current-position pointer to a specific place in the
file
• After this call has completed, data can be read from, or written to, that
position
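The operations described so far (create, write, append, read, seek, delete) can be tried out with Python's built-in file objects, which wrap the underlying system calls. This is only a sketch; the file name `demo.txt` and its contents are made up for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Create and write: the write pointer advances as data is written.
with open(path, "wb") as f:
    f.write(b"0123456789")

# Append: a restricted write that can only add to the end of the file.
with open(path, "ab") as f:
    f.write(b"ABC")

# Read and seek: reposition the current-position pointer, then read from there.
with open(path, "rb") as f:
    first = f.read(4)      # sequential read from the start
    f.seek(10)             # jump to byte offset 10
    tail = f.read()        # read from the new position to the end

print(first, tail)

# Delete: release the file's space and remove its directory entry.
os.remove(path)
```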
• Get Attributes:
• Processes often need to read file attributes to do their work.
• For example, the UNIX make program is commonly used to manage
software development projects consisting of many source files.
• When make is called, it examines the modification times of all the source
and object files and arranges for the minimum number of compilations
required to bring everything up to date.
• To do its job, it must look at the attributes, namely, the modification times.
• Set Attributes:
• Some of the attributes are user settable and can be changed after the file
has been created. This system call makes that possible.
• Rename:
• It frequently happens that a user needs to change the name of an existing
file. This system call makes that possible.
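The make-style check on modification times can be sketched in a few lines. This is not how make itself is implemented; `needs_rebuild` and the file names `main.c`, `main.o`, and `app.c` are hypothetical, and `os.utime` stands in for the "set attributes" call.

```python
import os
import tempfile

def needs_rebuild(source, target):
    """Return True if target is missing or older than source."""
    if not os.path.exists(target):
        return True
    return os.path.getmtime(source) > os.path.getmtime(target)

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "main.c")
obj = os.path.join(workdir, "main.o")
for p in (src, obj):
    with open(p, "w") as f:
        f.write("")

# Set attributes: force the modification times so the comparison is predictable.
os.utime(obj, (1000, 1000))   # object file "built" at t=1000
os.utime(src, (2000, 2000))   # source edited later, at t=2000
stale = needs_rebuild(src, obj)

# Rename: change the name of an existing file.
renamed = os.path.join(workdir, "app.c")
os.rename(src, renamed)
```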
• Truncating a file:
• The contents of a file are erased but its attributes stay
• Most of these file operations involve searching the directory for the entry associated
with the named file
• To avoid this constant searching, many systems require that an ‘open’ system call be
used before that file is first used
• The OS keeps a small table containing info about all open files
• When a file operation is requested, the file is specified via an index into the open-file
table, so no searching is required
• When the file is no longer actively used, it is closed by the process and the OS removes
its entry in the open-file table
• Some systems implicitly open a file when the first reference is made to it, and close it
automatically when the program ends
• Most systems require that the programmer open a file explicitly with the ‘open’ system
call before that file can be used
• A per-process table tracks all files that a process has open and includes access rights to
the file & accounting info
• Each entry in the per-process table in turn points to a system-wide open-file table,
which contains process-independent info, such as the file’s disk location, access dates,
and file size
• Information associated with an open file:
• File pointer:
• For the system to track the last read-write location
• File open count:
• A counter tracks the number of opens & closes and reaches zero on the last close
• Disk location of the file:
• Location info is kept in memory to avoid having to read it from disk for each
operation
• Access rights:
• Each process opens a file in an access mode
• Open file locking:
• Provided by some operating systems and file systems
• Mediates access to a file
• Mandatory or advisory:
• Mandatory–access is denied depending on locks held and requested
• Advisory–processes can find status of locks and decide what to do
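The two-level bookkeeping (a per-process table whose entries point into a system-wide open-file table with an open count) can be modelled with a toy in-memory structure. This is a simplified sketch, not a real kernel design; all class and field names are invented for illustration.

```python
class SystemOpenFileTable:
    """System-wide table: one entry per open file, shared by all processes."""
    def __init__(self):
        self.entries = {}  # file name -> {"open_count": int, "size": int}

    def open(self, name, size=0):
        entry = self.entries.setdefault(name, {"open_count": 0, "size": size})
        entry["open_count"] += 1
        return entry

    def close(self, name):
        entry = self.entries[name]
        entry["open_count"] -= 1
        if entry["open_count"] == 0:      # last close: drop the entry
            del self.entries[name]


class Process:
    """Per-process table: index -> file pointer, access mode, shared entry."""
    def __init__(self, system_table):
        self.system_table = system_table
        self.table = []

    def open(self, name, mode):
        shared = self.system_table.open(name)
        self.table.append({"name": name, "mode": mode, "pointer": 0, "shared": shared})
        return len(self.table) - 1        # index used by later operations

    def close(self, index):
        self.system_table.close(self.table[index]["name"])
        self.table[index] = None


system = SystemOpenFileTable()
p1, p2 = Process(system), Process(system)
fd1 = p1.open("notes.txt", "r")
fd2 = p2.open("notes.txt", "w")
count_while_open = system.entries["notes.txt"]["open_count"]
p1.close(fd1)
p2.close(fd2)
```

After both processes close the file, the open count reaches zero and the system-wide entry disappears.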
• File Access:
• Sequential Access
• Information in the file is processed in order
• The most common access method (e.g. editors & compilers)
• Read: reads the next file portion and advances a file pointer
• Write: appends to the end of file and advances to the new end

• Direct Access
• A file is made up of fixed-length logical records that allow you to read & write records
rapidly in no particular order
• File operations include a relative block number as parameter, which is an index relative
to the beginning of the file
• The use of relative block numbers
• allows the OS to decide where the file should be placed and
• helps to prevent the user from accessing portions of the file system that may not be part of his file
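Direct access with fixed-length records comes down to multiplying the relative record number by the record size and seeking there. The sketch below assumes a hypothetical 20-byte record layout (a 4-byte id followed by a 16-byte name); the helper names are illustrative only.

```python
import os
import struct
import tempfile

RECORD = struct.Struct("<i16s")   # hypothetical layout: 4-byte id + 16-byte name

def write_records(path, records):
    with open(path, "wb") as f:
        for rec_id, name in records:
            f.write(RECORD.pack(rec_id, name.encode("ascii")))

def read_record(path, relative_number):
    """Read record number n (0-based, relative to the start of the file)."""
    with open(path, "rb") as f:
        f.seek(relative_number * RECORD.size)
        rec_id, raw = RECORD.unpack(f.read(RECORD.size))
        return rec_id, raw.rstrip(b"\x00").decode("ascii")

path = os.path.join(tempfile.mkdtemp(), "records.dat")
write_records(path, [(1, "alice"), (2, "bob"), (3, "carol")])
third = read_record(path, 2)   # jump straight to the third record
first = read_record(path, 0)   # then back to the first, in no particular order
```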



• When disks came into use for storing files, it became possible to read
the bytes or records of a file out of order, or to access records by key,
rather than by position.
• Files whose bytes or records can be read in any order are called
random access files

(Figure: sequential access versus random access)


• In random access, a record is located on the disk from its direct address
information.
• One technique for this is hashing.
• In hashing, every record is associated with a key, from which its address
is calculated.
• A hash function is applied to the key to obtain the absolute address of a
particular record
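A hash-based address calculation might look like the following sketch. The record size, the number of slots, and the simple modulo hash function are all assumptions made for this example, and collisions (two keys mapping to the same slot) are ignored.

```python
RECORD_SIZE = 64      # bytes per record (assumed)
NUM_SLOTS = 101       # number of record slots in the file (assumed)

def hash_slot(key):
    """Map a numeric key to a slot number."""
    return key % NUM_SLOTS

def record_address(key):
    """Byte offset of the slot where the record with this key is stored."""
    return hash_slot(key) * RECORD_SIZE

print(record_address(1234))
```

A real system also needs a policy for collisions, such as probing the next slot or chaining overflow records.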
• DIRECTORIES:
• A directory contains information about files.
• A directory is used as a means to group the files owned by a user.
• To keep track of files, file systems normally have directories, which, in
many systems, are themselves files



• Hierarchical Directory Systems :
• Directory contains a number of entries- one for each file
• Directory may keep the attributes of a file within itself, like a table ,or may
keep them elsewhere and access them through a pointer

• When a file is opened, the operating system searches its directory until it
finds the name of the file to be opened.
• It then extracts the attributes and disk addresses, either directly from the
directory entry or from the data structure pointed to, and puts them in a
table in main memory.
• All subsequent references to the file use the information in main memory.



• Level of Directory:
• Single Directory for all the users:
• Used in most primitive systems
• Can cause conflicts and confusion
• May not be appropriate for any large multiuser system
• One Directory per user:
• Eliminates name confusion across users
• May not be satisfactory if a user has many files
• Hierarchical directories:
• Tree like structure
• Root directory sits at the top of the tree, all directories spring out of the root
• Allows logical grouping of files
• Every process can have its own working directory to avoid affecting other
processes
• Used in most of the modern systems

• Path Names:
• Convention to specify a file in a tree-like hierarchy of directories
• Hierarchy starts at the directory /, known as the root directory
• Made up of a list of directories crossed to reach the file, followed by
the file name itself
• On some systems, the name of a directory must be shorter than 256 characters
• On those systems, no path name can be longer than 1023 characters; the exact limits vary by system
• Absolute path name:
• It is a listing of the directories and files from the root directory to the
intended file.
• For example, the path ‘c:/windows/programs/spss.exe’ means that the root
directory contains a subdirectory ‘windows’, which in turn contains a
subdirectory ‘programs’, which contains an executable ‘spss.exe’.
• Always starts at the root directory and is unique.
• Relative path name:
• This uses the concept of a current directory (also known as the working
directory).
• A user can specify a particular directory as his current working
directory and all the path names instead of being specified from the
root directory are specified relative to the working directory.
• For example, if the current working directory is ‘\user\curr’, then the
file whose absolute path is ‘\user\curr\student’ can be referred
simply as ‘student’.
• More convenient than the absolute form and achieves the same
effect
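Resolving a relative path against the working directory can be sketched with Python's `posixpath` module. Note the assumption: this example uses UNIX-style `/` separators rather than the backslashes shown in the example above, and `resolve` is a helper written for this illustration.

```python
import posixpath

def resolve(path, working_directory):
    """Turn a path into an absolute one, relative to the working directory."""
    if posixpath.isabs(path):
        return posixpath.normpath(path)
    return posixpath.normpath(posixpath.join(working_directory, path))

print(resolve("student", "/user/curr"))
print(resolve("../other/file", "/user/curr"))
print(resolve("/etc/passwd", "/user/curr"))
```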



• Directory Operations:
• CREATE:
• A directory is created. It is empty except for dot (.) and dotdot (..), which are put
there automatically by the system (or in a few cases, by the mkdir program).
• DELETE:
• A directory is deleted. Only an empty directory can be deleted.
• A directory containing only dot and dotdot is considered empty as these
usually cannot be deleted.
• OPENDIR:
• Directories can be read.
• For example, to list all the files in a directory, a listing program opens the
directory to read out the names of all the files it contains.
• Before a directory can be read, it must be opened, analogous to opening and
reading a file.



• CLOSEDIR:
• When a directory has been read, it should be closed to free up internal table space.
• READDIR:
• This call returns the next entry in an open directory.
• Formerly, it was possible to read directories using the usual READ system call, but that approach
has the disadvantage of forcing the programmer to know and deal with the internal structure of
directories.
• In contrast, READDIR always returns one entry in a standard format, no matter which of the
possible directory structures is being used.
• RENAME:
• In many respects, directories are just like files and can be renamed the same way files can be.
• LINK:
• Linking is a technique that allows a file to appear in more than one directory.
• This system call specifies an existing file and a path name, and creates a link from the existing file
to the name specified by the path. In this way, the same file may appear in multiple directories.
• UNLINK:
• A directory entry is removed.
• If the file being unlinked is only present in one directory (the normal case), it is removed from the
file system.
• If it is present in multiple directories, only the path name specified is removed. The others remain
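These directory operations map fairly directly onto functions in Python's `os` module. The sketch below assumes a file system that supports hard links (true of typical UNIX file systems); the directory and file names are made up, and `os.scandir` plays the role of OPENDIR, READDIR, and CLOSEDIR.

```python
import os
import tempfile

base = tempfile.mkdtemp()
docs = os.path.join(base, "docs")
os.mkdir(docs)                                   # CREATE

original = os.path.join(docs, "report.txt")
with open(original, "w") as f:
    f.write("draft")

alias = os.path.join(base, "report-link.txt")
os.link(original, alias)                         # LINK: same file, second name
links_after_link = os.stat(original).st_nlink

# OPENDIR / READDIR / CLOSEDIR: scandir yields one entry at a time
# and the "with" block closes the directory afterwards.
with os.scandir(docs) as entries:
    names = sorted(entry.name for entry in entries)

os.unlink(original)                              # UNLINK: one name removed
with open(alias) as f:
    still_there = f.read()                       # data survives via the other name

os.unlink(alias)                                 # last name gone: file removed
os.rmdir(docs)                                   # DELETE: directory must be empty
```

Note that `os.scandir` does not report the dot and dotdot entries.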



• FILE SYSTEM IMPLEMENTATION:
• Implementing Files:
• Support of primitives for manipulating files and directories.
• There are four common ways of implementing file storage:
1. Contiguous allocation
2. Linked list allocation
3. Linked list allocation using an index
4. I-nodes
• Contiguous allocation:
• The simplest allocation scheme is to store each file as a contiguous block of data on
the disk.
• Files can be accessed by knowing the first block of the file on the disk
• Improves performance as the entire file can be read in one operation
• Thus, on a disk with a block size of 1 KB, a 25-KB file would be allocated 25 consecutive
blocks.
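The arithmetic behind contiguous allocation is simple enough to write down. In this sketch the starting block number 100 is an arbitrary value chosen for illustration.

```python
BLOCK_SIZE = 1024  # 1-KB blocks, as in the example above

def blocks_needed(file_size):
    """Number of whole blocks needed to hold file_size bytes."""
    return -(-file_size // BLOCK_SIZE)   # ceiling division

def disk_block(first_block, byte_offset):
    """Disk block holding a given byte of a contiguously allocated file."""
    return first_block + byte_offset // BLOCK_SIZE

print(blocks_needed(25 * 1024))
print(disk_block(first_block=100, byte_offset=5000))
```

Because the blocks are consecutive, any byte can be located from the first block number alone, with no chain to follow.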



• Linked list allocation:
• The second method for storing files is to keep each one as a linked list of disk
blocks.
• The first word of each block is used as a pointer to the next one.
• The rest of the block is used for storing data
• Only the address of the first block appears in the directory entry
• No space is lost to external disk fragmentation
• Disadvantage
• Random access is extremely slow
• The amount of data stored in a block is no longer a power of 2 because the pointer takes up a few bytes



• Linked List Allocation Using an Index:
• In this technique instead of having a pointer, an index is maintained
• Memory contains a table pointing to each disk block called a FAT(File
allocation table)
• The entire block is available for data
• Random access is easier: the chain must still be followed to find a given offset, but it
can be followed without making any disk references, because the table is in memory
• Large files can be accessed easily
• Disadvantage
• The entire table must be in memory all the time to make it work
• With a 20-GB disk and a 1-KB block size, the table needs 20 million entries, one
for each of the 20 million disk blocks.
• Each entry has to be a minimum of 3 bytes.
• For speed in lookup, they should be 4 bytes, so the table takes up about 80 MB of memory.
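Following a file's chain through an in-memory table, and the size of that table for the disk discussed above, can be sketched as follows. The toy table contents and the `END` marker value are assumptions for this example, not the layout of any real FAT.

```python
END = -1  # marker for "last block of the file" (assumed convention)

def file_blocks(fat, first_block):
    """Follow the chain in the in-memory table; no disk access needed."""
    blocks = []
    block = first_block
    while block != END:
        blocks.append(block)
        block = fat[block]
    return blocks

# Toy table: the file starting at block 4 occupies blocks 4 -> 7 -> 2 -> 10.
fat = {4: 7, 7: 2, 2: 10, 10: END}
chain = file_blocks(fat, 4)

# Table size for a 20-GB disk with 1-KB blocks and 4-byte entries.
entries = (20 * 2**30) // 2**10
table_bytes = entries * 4
print(chain, entries, table_bytes // 2**20, "MB")
```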

• I-nodes:
• This method is used by the UNIX operating system.
• The last method for keeping track of which blocks belong to which file
is to associate with each file a little table called an i-node (index
node), which lists the attributes and disk addresses of the file's
blocks.



• The first few disk addresses are stored in the i-node itself, so for small
files, all the necessary information is right in the i-node, which is
fetched from disk to main memory when the file is opened.
• For somewhat larger files, one of the addresses in the i-node is the
address of a disk block called a single indirect block.
• This block contains additional disk addresses.
• If this still is not enough, another address in the i-node, called a
double indirect block, contains the address of a block that contains a
list of single indirect blocks.
• Each of these single indirect blocks points to a few hundred data
blocks.
• If even this is not enough, a triple indirect block can also be used.
UNIX uses this scheme.
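To see how far each level of indirection reaches, the sketch below works out which part of the i-node structure locates a given logical block. The block size, address size, and number of direct addresses are assumed values for illustration; real UNIX variants differ.

```python
BLOCK_SIZE = 1024                 # bytes per block (assumed)
ADDRESS_SIZE = 4                  # bytes per disk address (assumed)
DIRECT = 10                       # addresses held in the i-node itself (assumed)
PER_BLOCK = BLOCK_SIZE // ADDRESS_SIZE   # addresses per indirect block

def level_for_block(n):
    """Which part of the i-node structure locates logical block n of a file."""
    if n < DIRECT:
        return "direct"
    n -= DIRECT
    if n < PER_BLOCK:
        return "single indirect"
    n -= PER_BLOCK
    if n < PER_BLOCK ** 2:
        return "double indirect"
    n -= PER_BLOCK ** 2
    if n < PER_BLOCK ** 3:
        return "triple indirect"
    raise ValueError("block number beyond maximum file size")

# Largest file, in blocks, that this structure can describe.
max_blocks = DIRECT + PER_BLOCK + PER_BLOCK ** 2 + PER_BLOCK ** 3
print(PER_BLOCK, max_blocks)
```

With these numbers each indirect block holds 256 addresses, which matches the "few hundred data blocks" mentioned above.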



• Allocation Methods:



• Indexed Allocation
• Brings all pointers together into the index block.
• Logical view.
• Need index table
• Random access
• Dynamic access without external fragmentation, but has the overhead of the index block.
• Contiguous Allocation
• Each file occupies a set of contiguous blocks on the disk
• Simple – only starting location (block #) and length (number of blocks) are required
• Random access
• Wasteful of space (dynamic storage-allocation problem)
• Files cannot grow
• Linked Allocation
• Simple – need only starting address
• Free-space management system – no waste of space
• No random access



• Implementing Directories:
• When a file is opened, the operating system uses the path name
supplied by the user to locate the directory entry.
• The directory entry provides the information needed to find the disk blocks.
• Depending on the system, this information may be the disk address of
the entire file (contiguous allocation), the number of the first block
(both linked list schemes), or the number of the i-node.



• Disk Space Management:
• Files are normally stored on disk
• Two general strategies are possible for storing an n-byte file:
• n consecutive bytes of disk space are allocated, or
• the file is split up into a number of blocks that are not necessarily contiguous
• Storing a file as a contiguous sequence of bytes has the obvious problem
that if a file grows, it will probably have to be moved on the disk.
• For this reason, nearly all file systems chop files up into fixed-size blocks
that need not be adjacent
• Block Size:
• A large allocation unit wastes disk space, because even a small file occupies a whole unit
• A small allocation unit means each file consists of many blocks; reading each block normally
requires a seek and a rotational delay, so reading a file consisting of many small blocks will be slow.
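The trade-off can be made concrete with a little arithmetic. The file sizes and block sizes below are arbitrary values chosen for this sketch.

```python
def wasted_bytes(file_size, block_size):
    """Unused bytes in the last block of a file (internal fragmentation)."""
    blocks = -(-file_size // block_size)     # ceiling division
    return blocks * block_size - file_size

def blocks_to_read(file_size, block_size):
    """Number of blocks that must be fetched to read the whole file."""
    return -(-file_size // block_size)

for block_size in (512, 4096, 32768):
    print(block_size, wasted_bytes(2000, block_size), blocks_to_read(100_000, block_size))
```

Larger blocks waste more space on a small file but need fewer block reads for a large one.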



• Keeping Track of Free Block:
• Free space management keeps track of free disk blocks so that the space released by
deleting files can be reused.
• There are four techniques for free space management:
1. Bit map
2. Linked list
3. Grouping
4. Counting
• Bit map:
• The free space list is implemented as a bit map. Every bit represents
a block on the disk. The bit for a block is 1 if it is free and it is 0 if the
block is allocated.
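A bit map with this convention (1 = free, 0 = allocated) can be sketched with a plain list of bits. The class and method names are invented for this example; a real implementation would pack the bits into machine words.

```python
class FreeSpaceBitmap:
    """Bit i is 1 if block i is free and 0 if it is allocated."""
    def __init__(self, num_blocks):
        self.bits = [1] * num_blocks      # all blocks start out free

    def allocate(self):
        """Find the first free block, mark it allocated, return its number."""
        for block, bit in enumerate(self.bits):
            if bit == 1:
                self.bits[block] = 0
                return block
        raise RuntimeError("disk full")

    def release(self, block):
        """Mark a block as free again, e.g. after its file is deleted."""
        self.bits[block] = 1

bitmap = FreeSpaceBitmap(8)
a = bitmap.allocate()
b = bitmap.allocate()
bitmap.release(a)
c = bitmap.allocate()   # the released block is reused
```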
• Linked list:
• This approach maintains a linked list of all the free disk blocks. The
first free block in the list can be pointed out by a head pointer, which
is kept in a special location on the disk.



• Efficiency and Performance
• Efficiency dependent on:
• disk allocation and directory algorithms
• types of data kept in file’s directory entry
• Performance
• disk cache – separate section of main memory for frequently used
blocks
• free-behind and read-ahead – techniques to optimize
sequential access
• improve PC performance by dedicating section of memory as virtual
disk, or RAM disk



• Recovery
• Consistency checking – compares data in directory structure with data blocks on
disk, and tries to fix inconsistencies
• Use system programs to back up data from disk to another storage device (floppy
disk, magnetic tape, other magnetic disk, optical)
• Recover lost file or disk by restoring data from backup
• Log Structured File Systems
• Log structured (or journaling) file systems record each update to the file system
as a transaction
• All transactions are written to a log
• A transaction is considered committed once it is written to the log
• However, the file system may not yet be updated
• The transactions in the log are asynchronously written to the file system
• When the file system is modified, the transaction is removed from the log
• If the file system crashes, all remaining transactions in the log must still be
performed
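The log-then-apply sequence described above can be modelled with a toy in-memory store. This is only a sketch of the idea, not a real journaling implementation; the class `JournaledStore` and the dictionary standing in for the file system are assumptions.

```python
class JournaledStore:
    """Toy model: updates are committed to a log, then applied later."""
    def __init__(self):
        self.log = []      # committed transactions not yet applied
        self.data = {}     # stands in for the on-disk file system structures

    def update(self, key, value):
        # Committed as soon as it is in the log; the store itself is unchanged.
        self.log.append((key, value))

    def apply_one(self):
        # Asynchronous write-back: apply the oldest transaction, then drop it.
        key, value = self.log[0]
        self.data[key] = value
        self.log.pop(0)

    def recover(self):
        # After a crash, perform every transaction still in the log.
        while self.log:
            self.apply_one()

store = JournaledStore()
store.update("a.txt", "v1")
store.update("b.txt", "v1")
store.apply_one()                 # only the first update reaches the store
pending_before = len(store.log)   # the "crash" happens here
store.recover()
```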
