File System
What should be there inside the file system to provide these functions?
1) Routines (allocation, free space mgmt.)
2) Code corresponding to system calls
3) Information
(a) Some information is needed only while the system is running, e.g. which process has opened
which file.
This information is kept in tables in main memory, and these tables are part of the file system.
(b) Tables on each secondary storage device, containing entries such as the file
name and the corresponding sector or block number.
Format Operation:- To store raw information on a fresh disk, we first have to create sectors on it; this is
called the format operation.
File System Creation:- To store data in terms of files, we have to create tables on the secondary storage
device; this is called file system creation.
In DOS, the format command does both formatting and file system creation.
In Unix, format does the format operation and
mkfs does the file system creation operation.
In order to specify a unique location on disk, a three-dimensional address is needed: surface number, track number, sector number.
Large block sizes are disadvantageous for small files, because even a small file
must be allocated a whole large block. So space is wasted due to internal fragmentation.
So there is a trade-off in deciding the optimum block size.
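The wastage described above can be estimated with a small calculation. The following is only a sketch; the function name internal_waste and the byte-based sizes are assumptions made for illustration.

```c
#include <assert.h>

/* Bytes wasted in the last, partially filled block of a file. */
unsigned long internal_waste(unsigned long file_size, unsigned long block_size)
{
    assert(block_size > 0);
    unsigned long used_in_last = file_size % block_size;
    return used_in_last == 0 ? 0 : block_size - used_in_last;
}
```

For a 100-byte file, internal_waste(100, 512) gives 412 bytes, while internal_waste(100, 4096) gives 3996 bytes, which shows why a larger block size hurts small files.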
1. Bitmap
Free block=1
Reserved block=0
Compared to 3rd, less space required.
2. Counting
3 3
7 2
From block no. 3 , 3 blocks are free.
From block no. 7 , 2 blocks are free.
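The two representations carry the same information. As a rough sketch, the routine below converts a bitmap into the counting list. The names free_run and bitmap_to_runs are hypothetical, and one byte per block is used instead of one bit purely to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* One entry of the counting list: `count` free blocks starting at `start`. */
struct free_run {
    size_t start;
    size_t count;
};

/*
 * Scan a bitmap (1 = free, 0 = reserved, one byte per block for clarity)
 * and write the equivalent counting-list entries into `runs`.
 * Returns the number of entries written (at most `max_runs`).
 */
size_t bitmap_to_runs(const unsigned char *bitmap, size_t nblocks,
                      struct free_run *runs, size_t max_runs)
{
    size_t nruns = 0;
    size_t i = 0;

    assert(bitmap != NULL && runs != NULL);
    while (i < nblocks && nruns < max_runs) {
        if (bitmap[i] == 1) {
            size_t start = i;
            while (i < nblocks && bitmap[i] == 1)
                i++;                       /* extend the run of free blocks */
            runs[nruns].start = start;
            runs[nruns].count = i - start;
            nruns++;
        } else {
            i++;                           /* reserved block, skip it */
        }
    }
    return nruns;
}
```

With the bitmap 0 0 0 1 1 1 0 1 1 0 this yields the two entries (3, 3) and (7, 2), matching the counting example above.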
ALLOCATION ALGORITHMS:
1) Contiguous allocation method: To allocate a file, all of its logical blocks must be contiguous on disk.
EX:
Holes available are of size 3, 2, 1, 5; the file is of size 2.
A hole of size n means n contiguous free blocks.
1. Choose the hole which is just enough to meet the requirement, i.e. the hole of size 2 (Best Fit method of
contiguous allocation). After this allocation, if the file grows and requires
expansion (say one more block), we will not be able to allocate it in place.
2. Choose the hole which is largest in size, i.e. the hole of size 5
(Worst Fit method of contiguous allocation).
This fragments the larger hole into smaller holes, so external fragmentation can become a problem.
External Fragmentation:- The total free space is large enough to meet our requirement, but the free
blocks are scattered, so we cannot allocate them to a file contiguously.
Internal Fragmentation:- Wastage of space due to a partially occupied block.
We cannot avoid internal fragmentation in any case.
Both best fit and worst fit take a long time, because they have to scan the whole list to find the smallest sufficient hole
or the largest hole respectively.
First Fit (first hole that meets the requirement), i.e. the hole of size 3.
Factors in choosing a method:
1) Kind of environment, e.g. in a real-time system, where time is important, we use first fit.
2) Nature of application, e.g. in a database application, where file size grows continuously, we use worst fit.
3) Where we are not going to modify the files, e.g. OS utilities, we use best fit.
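The three hole-selection policies can be sketched in a single routine. This is a minimal illustration, not the code of any real file system; choose_hole and fit_policy are hypothetical names, and holes are represented simply by their sizes in blocks.

```c
#include <assert.h>
#include <stddef.h>

enum fit_policy { FIRST_FIT, BEST_FIT, WORST_FIT };

/*
 * Return the index of the hole chosen for a file of `need` blocks,
 * or -1 if no single hole is large enough.
 */
int choose_hole(const size_t *holes, size_t nholes, size_t need,
                enum fit_policy policy)
{
    int chosen = -1;

    assert(holes != NULL);
    for (size_t i = 0; i < nholes; i++) {
        if (holes[i] < need)
            continue;               /* hole too small */
        if (chosen == -1) {
            chosen = (int)i;
            if (policy == FIRST_FIT)
                break;              /* first fit stops at the first match */
        } else if (policy == BEST_FIT && holes[i] < holes[(size_t)chosen]) {
            chosen = (int)i;        /* smaller hole that still fits */
        } else if (policy == WORST_FIT && holes[i] > holes[(size_t)chosen]) {
            chosen = (int)i;        /* larger hole */
        }
    }
    return chosen;
}
```

For the holes 3, 2, 1, 5 and a file of size 2, first fit picks the hole of size 3, best fit the hole of size 2 and worst fit the hole of size 5. Note that only first fit can stop early; the other two always scan the whole list.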
Having chosen best fit or first fit, what can we do if we want to expand our file?
1) Find some hole that meets the new requirement and shift the file there. This takes time
because:
a) read and write operations are required to move the blocks;
b) new entries have to be made in the file system tables.
Solution for External Fragmentation:
Bring all the occupied blocks together at one place, so that all the free blocks form
one large hole. This is called defragmentation or compaction.
This is also time consuming, for the same two reasons (a and b above).
ADVANTAGES OF CONTIGUOUS ALLOCATION:
1. Access time is very low because the blocks are contiguous
(seek time and rotational latency are small).
Seek time: time to move the R/W head to the proper cylinder.
Rotational delay: time for the proper sector to rotate under the head.
2. Random (direct) access is possible.
DISADVANTAGES:
1. It suffers from external fragmentation.
If the system is time critical, e.g. a real-time system, then we use this method.
If space is scarce, e.g. a very small hard disk, then we do not go for this method.
LINKED ALLOCATION:
We keep a linked list of the blocks pertaining to a file.
In the directory table, for contiguous allocation, each entry holds:
File name | Starting address | Size of file
[Figure: Unix file system layout: Boot block | Super block | Inode Table | Data Area]
The boot block consists of routines related to the booting operation; it is not directly concerned with the
file system.
The super block contains information about the state of the file system, such as:
Total no of logical blocks.
No. of free blocks
A list of free blocks
Pointer to next free block
Total no of inodes
I-node number
Free space management technique in the standard Unix FS: keeping the addresses of all free blocks at one place.
In the ext2 file system in Red Hat Linux (given in the case study parts of Tanenbaum and Galvin):
bitmap.
The data area also contains a linked list of free blocks. The pointer to the first free block is available in the super block.
Whenever a file is deallocated, the corresponding i-node entry is made in the super block.
Similarly, the freed block entries are also added to the super block.
A copy of the super block is kept in RAM (main memory). This saves
time, because we do not need to access the hard disk for each allocation or deallocation. All updates
are made to the super block in main memory. The super block on the hard disk is updated at regular
intervals by the sync() system call, and it is also copied to the hard disk when we shut down the system.
What method does Unix use for allocation of blocks?
The i-node contains the addresses of 10 direct blocks.
The 11th entry contains the address of an index block (which in turn contains the addresses of data blocks of the file), i.e.
single indirect or first level of indexing.
The 12th entry gives the second level of indexing (double indirect).
The 13th entry gives the third level of indexing (triple indirect).
Maximum file size supported by Unix = (10 + 256 + 256*256 + 256*256*256) blocks,
assuming one block holds 256 data block addresses.
For small files, the space overhead and the access time are both low, which is why 10 direct blocks are used.
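The arithmetic above can be checked with a short sketch. The function names max_file_blocks and indirection_level are hypothetical; the second one additionally shows which level of indexing a given logical block number (counted from 0) falls into, under the same assumption of 10 direct entries and 256 addresses per index block.

```c
#include <assert.h>

/*
 * Number of data blocks addressable through an inode with `ndirect`
 * direct entries plus one single, one double and one triple indirect
 * entry, when one index block holds `per_block` addresses.
 */
unsigned long long max_file_blocks(unsigned long long ndirect,
                                   unsigned long long per_block)
{
    assert(per_block > 0);
    return ndirect
         + per_block
         + per_block * per_block
         + per_block * per_block * per_block;
}

/*
 * Level of indexing needed to reach logical block `n` (0-based):
 * 0 = direct, 1 = single, 2 = double, 3 = triple indirect,
 * -1 = beyond the maximum file size.
 */
int indirection_level(unsigned long long n, unsigned long long ndirect,
                      unsigned long long per_block)
{
    assert(per_block > 0);
    if (n < ndirect) return 0;
    n -= ndirect;
    if (n < per_block) return 1;
    n -= per_block;
    if (n < per_block * per_block) return 2;
    n -= per_block * per_block;
    if (n < per_block * per_block * per_block) return 3;
    return -1;
}
```

max_file_blocks(10, 256) evaluates to 16843018 blocks. Blocks 0 to 9 need no index block at all, which is the reason small files are cheap to access.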
FAT file system layout:
Boot block (512 bytes) | FAT #1 (14 physical blocks) | FAT #2 (14 physical blocks) | Root directory table | Data area
[Figure: ROOT DIRECTORY entry for filename XYZ and its chain in the FAT; the FAT entries shown are free cluster, free cluster, 7, 230 and (-1) for end of file]
Directory Structures:
A directory is a table in which we keep information about the files.
Root directory table:- By default, the root directory table is created whenever a file system is created.
Sometimes it is also called the device directory.
Directories can be regarded as files.
1. [Figure: a single directory with entries F1, F2, F3, ..., Fn, which are names of ordinary files (not directories)]
2. [Figure: user directories U1 and U2 holding files F1, F2, F3, F4]
3. Tree Structure:-
Sharing of files (between user1 and user2): we can copy the file from user2 to user1, but the problems are:
1. Copying is cumbersome; it takes time in read and write operations. If 10 users want to
share a file, copying becomes even more cumbersome.
2. Consistency is lost: a change made to one copy does not appear in the other copies.
In order to avoid inconsistency, there should be physically only one copy of the file. Also, each user should feel that the file
is in their own directory.
A tree structure cannot realize this, so it cannot support sharing of files. So we go for a graph structure.
[Figure: graph directory structure; labels shown: u1, F1, u1, F2]
For every user who wants to share the file, make an entry for it in that user's directory table with the same inode number,
so that the file remains physically one, but each user feels that the file is in their own
directory table.
A shared file should not be physically deleted as long as at least one user still has a link to it.
DELETION ALGORITHM
1. Delete the directory entry of the file.
2. If the link count in the inode is 1, delete the file physically; if the link count is not 1, decrement it.
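Step 2 of the algorithm can be sketched as follows. The struct inode here is a hypothetical, heavily simplified stand-in with only the fields the sketch needs, and drop_link is an assumed name, not a real kernel routine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified inode: only the fields this sketch needs. */
struct inode {
    int  link_count;   /* number of directory entries naming this inode */
    bool allocated;    /* false once the file has been physically deleted */
};

/*
 * Called after the directory entry has been removed (step 1).
 * Returns true if the file was physically deleted.
 */
bool drop_link(struct inode *ip)
{
    assert(ip != NULL && ip->allocated && ip->link_count >= 1);
    if (ip->link_count == 1) {
        ip->link_count = 0;
        ip->allocated = false;   /* last link gone: free inode and blocks */
        return true;
    }
    ip->link_count--;            /* other users still share the file */
    return false;
}
```

With a file shared by two users (link count 2), the first call only decrements the count; the second call deletes the file physically.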
It is also possible to share directories (to look at the listing of files, to create files in the directory, to
delete files in the directory).
The tree structure should not be converted into a cyclic graph structure.
The ln -d command (for directory sharing using hard links) can be used only by the superuser, to
ensure that the graph structure does not become cyclic; on some systems even the superuser cannot execute it.
That is why the graph structure in UNIX is an acyclic graph structure.
Reasons:
1> Such cyclic sharing is of no use.
2> If there is such sharing, the problem is: suppose we give the ls -lR command; it will show a
recursive listing of files infinitely.
- ln -s d1 d2 can be used for creating symbolic links between directories. Here you can create cycles, but the ls
-lR command will traverse them only once.
Q> Why is the link count of a directory at least 2?
Ans> A directory always has two links: its dot (.) entry and its name entry in the parent directory, through which its pathname reaches it.
Note: Read the shared files topic (page 408) from Tanenbaum, Acyclic Graph Directories & General
Graph Directory from Galvin, and links from S.Das. A comparison of hard links and symbolic links is
not included in these notes.
Diagram of the user file descriptor table, global file table and in-core inode table.
A process is a program under execution in main memory. Any program or process uses files.
Whenever a process is created, a file descriptor table is created in memory for that process.
Suppose the process executes the open system call to open file f1.
int fd;
fd = open("f1", mode);
The open system call returns the lowest unoccupied file descriptor from the file descriptor table. The descriptor points to an
entry of the file table (consisting of the RDWR ptr and MODE info), and from there to the in-core inode of
file f1.
Now, any operation on the file (read, write, close, lseek, dup etc.) always uses the file descriptor and not the
filename.
read(fd, buffer, 512); ----- reads the first 512 bytes of f1
read(fd, buffer, 512); ----- reads the next 512 bytes of f1 (the RDWR ptr corresponding to fd is at 512)
fd1 = open("f1", mode); (opening a second instance of f1 makes the reference count = 2 in the in-core
inode entry of file f1)
read(fd1, buffer, 512); ----- reads the first 512 bytes of f1 (the RDWR ptr corresponding to fd1 is at 0)
Additional info. in the in-core inode table:
1. A reference count of how many instances of the file are open.
2. The inode no.
3. A flag that indicates whether changes have been made in the in-core inode or not.
For close(fd1):
1. Free fd1 in the fd table, release the corresponding entry of the global file table, and decrement the reference count
by 1.
2. If the reference count becomes 0, remove the corresponding in-core inode entry; save it to the inode table
on secondary storage if some process has changed it, otherwise simply discard
it.
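The interplay of the three tables can be imitated in user space. The sketch below is a toy model, not kernel code: sim_open, sim_read, sim_close and the struct names are all hypothetical, and the file table entries are supplied by the caller to avoid dynamic allocation.

```c
#include <assert.h>
#include <stddef.h>

#define NFD 20   /* per-process descriptor limit quoted in the notes */

struct incore_inode { int ref_count; };
struct file_entry   { long offset; struct incore_inode *ip; };

struct process {
    struct file_entry *fd_table[NFD];   /* NULL = unoccupied descriptor */
};

/* Open: take the lowest unoccupied descriptor, start the RDWR pointer at 0. */
int sim_open(struct process *p, struct file_entry *fe, struct incore_inode *ip)
{
    assert(p != NULL && fe != NULL && ip != NULL);
    for (int fd = 0; fd < NFD; fd++) {
        if (p->fd_table[fd] == NULL) {
            fe->offset = 0;
            fe->ip = ip;
            ip->ref_count++;
            p->fd_table[fd] = fe;
            return fd;
        }
    }
    return -1;                          /* descriptor table is full */
}

/* Read: only advances this descriptor's own RDWR pointer. */
long sim_read(struct process *p, int fd, long nbytes)
{
    assert(p != NULL && fd >= 0 && fd < NFD && p->fd_table[fd] != NULL);
    long start = p->fd_table[fd]->offset;
    p->fd_table[fd]->offset += nbytes;
    return start;                       /* offset at which the read began */
}

/* Close: free the descriptor; returns the remaining reference count. */
int sim_close(struct process *p, int fd)
{
    assert(p != NULL && fd >= 0 && fd < NFD && p->fd_table[fd] != NULL);
    struct incore_inode *ip = p->fd_table[fd]->ip;
    p->fd_table[fd] = NULL;
    return --ip->ref_count;
}
```

Opening the same file twice gives two descriptors with independent RDWR pointers but one shared in-core inode whose reference count is 2; when sim_close returns 0, the in-core inode would be written back or discarded as described in step 2.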
A process can have at most 20 files open simultaneously, as the FD
table has only 20 entries in the standard UFS.
Whenever a process is created, file descriptors 0, 1 and 2 are already occupied in the FD table.
FD-0 points to the standard input device, i.e. the keyboard.
FD-1 and FD-2 point to the standard output device, i.e. the monitor.
By default the process writes its output on file descriptor 1, and because
FD1 is by default connected to the monitor, the output goes to the monitor.
By default the process reads from FD0 and writes errors on FD2.
$ ls
By default, the process writes on FD1 and the output goes to the monitor.
$ ls > f1
The process ls still writes on FD1, but the output goes to file f1 because the shell has broken the
channel to the monitor and reconnected FD1 to f1.
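fd = open("f1", O_WRONLY | O_CREAT | O_TRUNC, 0644); /* open (or create) the target file */
close(1); /* free descriptor 1 (standard output) */
dup(fd); /* dup returns the lowest free descriptor, i.e. 1 */
close(fd); /* the original descriptor is no longer needed */
execl("/bin/ls", "ls", (char *)0); /* ls inherits FD1, now connected to f1 */
ls does not know that this redirection has taken place.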
Process
Program under Execution in main memory
Dynamic entity because the state of process
changes with respect to time
Consists of code, data and system data.
READY: A process has been created in memory with code, data and system data segments
(there is a ready queue of processes in the ready state).
RUNNING: The CPU is executing the process.
The short term scheduler (CPU scheduler) selects a process from the ready queue to execute on the CPU.
The CPU scheduler is invoked every few msec, while the long term scheduler (LTS) is invoked at longer intervals (for example when some
process leaves the system), to control the degree of multiprogramming.
WAITING: The process is waiting for I/O devices.
The transition from the WAITING state to the END state is not possible. The last instruction in a process is always
a CPU instruction, i.e. exit().
SWAPPED OUT: If we realize later that a process should not have been accepted by the long term scheduler, that
process is moved to the swapped out state by the medium term scheduler (MTS). The factors that the MTS considers are the same as for
the LTS.
In interactive systems there is no LTS, as response time is the important factor.
DEADLOCK: The process waits in the waiting queue indefinitely.
Read 3.1, 3.2, 3.3 topics from Galvin.
- The fork() system call creates a child process.
Syntax: id = fork();
It returns id = 0 to the child process, and id = pid of the child process to the parent process. The (virtual) address of id is
the same in both the child and the parent.
- In Fig. 3.9 of Galvin, there will be no wait() in the else part if the shell executes the child process in the
background.
- One of the reasons the shell creates a new process is that the shell can redirect the I/O of the newly created child
process.
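The two return values of fork() can be seen in a small sketch. It assumes a POSIX system; run_child is a hypothetical helper name, and the parent here always waits, i.e. it corresponds to the foreground case.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits with `code`; the parent waits for it and
 * returns the child's exit status, or -1 on error.
 */
int run_child(int code)
{
    assert(code >= 0 && code <= 255);
    pid_t id = fork();
    if (id < 0)
        return -1;                 /* fork failed */
    if (id == 0)
        _exit(code);               /* child: fork() returned 0 */

    /* parent: fork() returned the child's pid */
    int status;
    if (waitpid(id, &status, 0) != id)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A shell running a command in the background would simply skip the waitpid() call in the parent branch. A real shell would also place its I/O redirection and an exec call in the child branch, where this sketch only calls _exit().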
Read pipe from class notes and S.Das III ED or Design and implementation of Unix OS by M.Bach or
Galvin.
Creation of FS on hard disk:
The boot block does not necessarily go to the 1st physical sector of the hard disk; it may go to the 2nd sector or a later one.
To have multiple OSs on the same hard disk, we have to partition the hard disk into
logical hard disks, e.g. one for DOS and one for UNIX.
The first sector contains the partition table (which contains info about the available partitions on the disk) + the partition
table loader routine.
Partition no | Type of OS | Starting address | Size of partition | Active partition
1            | DOS        |                  |                   |
2            | Unix       |                  |                   |
Booting process:
1. Control goes to a predefined location in ROM containing the ROM bootstrap routine, and the ROM
bootstrap routine is executed.
2. Control goes to the very first sector of the disk and the PT (partition table) loader routine is executed. It loads the
partition table into memory and sees how many partitions are available and which partition is currently active.
3. Control is then transferred to the very first sector of the active partition, which holds the boot block
(consisting of the RAM bootstrap routine).
The remaining steps are different for DOS and Unix.
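Step 2 amounts to a scan of the partition table for the active entry. The following is only a sketch: struct partition is a hypothetical in-memory form using the columns of the table above, not the on-disk format of any real system, and find_active is an assumed name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory form of one partition table entry. */
struct partition {
    const char   *os_type;
    unsigned long start;     /* starting address (sector number) */
    unsigned long size;      /* size in sectors */
    int           active;    /* nonzero for the active partition */
};

/* Return the first active partition, or NULL if none is marked active. */
const struct partition *find_active(const struct partition *pt, size_t n)
{
    assert(pt != NULL);
    for (size_t i = 0; i < n; i++)
        if (pt[i].active)
            return &pt[i];
    return NULL;
}
```

The loader would then transfer control to the sector at the start address of the returned entry, which holds the boot block of that partition.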
The power-on self test (POST) routine is there in the BIOS; it checks all the devices before booting starts.
Certain utilities are available to partition a hard disk, e.g. the
fdisk utility (in both DOS and Unix).
The user may decide:
1. No. of partitions
2. Type of partition
3. Active partition
4. Size of partition
The partition table is also called:
- Master Boot Record (MBR) in DOS or Windows.
- Volume Table of Contents (VTOC) in UNIX.
LILO (Linux Loader) and GRUB are boot loaders for Linux. They also ask the user to choose
a particular partition as the active one.
Formatting of hard disk:
When the disk is fresh, low level formatting (physical formatting) creates sectors out of the tracks;
it also creates an empty partition table and erases all the data on the disk.
[Figure: disk layout: Partition table | I (DOS) | Unix-1 | Unix-2]
Unix-1: root file system (consists of OS files)
Unix-2: consists of user files
In UFS, when we boot the system, the root file system is accessible by default.
By default, the other file systems are not accessible. To make one accessible,
we have to mount it on some empty directory of the root FS.
$ mount <file system which is to be mounted> < >