Chapter 4
College of Technology
Department of Information Technology
1. Introduction
• A file is a named collection of related information that is recorded on secondary storage.
• From the user's point of view, it is the smallest unit of secondary storage: data cannot be written to secondary storage unless it is inside a file.
• Information in a file can be:
• Data: numeric, alphanumeric, alphabetic, or binary
• Program: source, object, executable, ...
Hence, a file is
a collection of records if it is a data file, or
a collection of bits / bytes / lines if it is a program.
• The OS provides an application called a file manager for organizing and managing files and folders in
the computer system.
• A file manager enables you to create, delete, copy, move, rename and view files and create
and manage directories (folders).
• The way in which an OS organizes, names, protects, accesses, and uses files is called a file
management system.
- The part of the operating system that deals with files is known as the file system.
- Activities of the OS with respect to files:
Reading from and writing to files.
Granting programs permission to operate on a file (read-only,
read-write, denied, etc.).
Providing an interface for users to create and delete files.
Providing an interface for users to create and delete directories.
Providing an interface for backing up the file system.
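The activities above correspond to ordinary calls that a program makes to the OS. A minimal sketch in Python, assuming the standard `os`, `stat`, and `tempfile` modules; the names `notes.txt` and `backup` are purely illustrative:

```python
import os
import stat
import tempfile

# Work inside a throwaway directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "notes.txt")   # hypothetical file name

    # Create the file and write to it.
    with open(path, "w") as f:
        f.write("hello file system\n")

    # Read it back.
    with open(path) as f:
        content = f.read()

    # Restrict the file to read-only for its owner.
    os.chmod(path, stat.S_IRUSR)
    owner_can_write = bool(os.stat(path).st_mode & stat.S_IWUSR)

    # Create a directory, then make the file writable again and delete it.
    subdir = os.path.join(workdir, "backup")     # hypothetical directory name
    os.mkdir(subdir)
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    os.remove(path)
    still_exists = os.path.exists(path)
```

Each step (open, read, chmod, mkdir, remove) is a request that the file system checks and carries out on the program's behalf.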
2. Files
Common terms:
Field
• Basic element of data (name, date, etc.)
Record
• Collection of related fields that we treat as a unit (employee record)
• May be of a fixed or variable size
File
• Collection of similar records
• Treated as an entity by applications
• Usually referenced by a name
• Access controls are usually at file level
Database
• Collection of related data files
• Relationships are explicit
• Used by a number of applications
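The terms field, record, and file can be made concrete with a small sketch. It assumes a fixed-size employee record encoded with Python's `struct` module; the layout (4-byte id, 20-byte name, 8-byte salary) and the sample names are hypothetical:

```python
import struct

# One record = three fields: id (4-byte int), name (20 bytes), salary (8-byte float).
# The layout is a hypothetical example, not taken from the slides.
RECORD = struct.Struct("<i20sd")

def pack_employee(emp_id, name, salary):
    """Encode one employee record as a fixed-size byte string."""
    return RECORD.pack(emp_id, name.encode("utf-8"), salary)

def unpack_employee(raw):
    """Decode one fixed-size record back into its fields."""
    emp_id, name, salary = RECORD.unpack(raw)
    return emp_id, name.rstrip(b"\x00").decode("utf-8"), salary

# A "file" here is simply the records stored one after another.
employees = [(1, "Abebe", 5200.0), (2, "Sara", 6100.5)]
data_file = b"".join(pack_employee(*e) for e in employees)

record_size = RECORD.size
second = unpack_employee(data_file[record_size:2 * record_size])
```

Because every record has the same size, the second record is found by slicing at a multiple of `record_size`.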
File Naming
- A name identifies a file in an abstract way, so users do not need to know how or where the data is stored.
- A file name is a string of characters.
- Most operating systems support a file extension as part of the file name.
- In some operating systems (such as Microsoft Windows), the file
extension is used to associate the file with a program, while in others it
is only a convention.
- Some characters, e.g. /, \, >, have special meaning in the file system, so
they cannot be used as part of a file name.
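A minimal sketch of the naming rules, assuming Python's `os.path.splitext` for separating the extension. The set of forbidden characters differs between operating systems, so the set below is an assumption, and both helper names are hypothetical:

```python
import os

# Characters rejected by this sketch; the exact set differs between
# operating systems, so treat this list as an assumption.
FORBIDDEN = set('/\\<>:"|?*')

def is_valid_name(name):
    """Return True if the name is non-empty and has no forbidden character."""
    return bool(name) and not any(ch in FORBIDDEN for ch in name)

def split_name(name):
    """Split 'report.txt' into ('report', 'txt')."""
    base, ext = os.path.splitext(name)
    return base, ext.lstrip(".")
```

Only the last period counts as the separator, so `split_name("archive.tar.gz")` gives the extension `gz`, and a name without a period gets an empty extension.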
File Structure
Three common ways to structure files:
(a) Byte sequence: the operating system does not know or care what is in the file. All it sees are bytes.
(b) Record sequence: a file is a sequence of fixed-length records, each with some internal structure.
(c) Tree: a file consists of a tree of records, not necessarily all the same length, each containing a key field in a fixed position in the record.
Figure: Three kinds of files. (a) Byte sequence. (b) Record sequence. (c) Tree.
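The difference between the byte-sequence and record-sequence views can be sketched as follows. The example assumes a hypothetical record length of 16 bytes and uses an in-memory `io.BytesIO` object in place of a real disk file:

```python
import io

RECORD_SIZE = 16  # hypothetical fixed record length in bytes

def make_record(text):
    """Pad or truncate text to exactly RECORD_SIZE bytes."""
    return text.encode("ascii")[:RECORD_SIZE].ljust(RECORD_SIZE, b" ")

def read_record(f, index):
    """Read record number `index` by seeking to index * RECORD_SIZE."""
    f.seek(index * RECORD_SIZE)
    return f.read(RECORD_SIZE).decode("ascii").rstrip()

# The same bytes can be viewed either way.
f = io.BytesIO(b"".join(make_record(t) for t in ["alpha", "beta", "gamma"]))
as_bytes = f.getvalue()          # byte-sequence view: just a run of bytes
third = read_record(f, 2)        # record-sequence view: the third record
```

In the byte-sequence view the meaning of the bytes is left entirely to the program; in the record-sequence view the position of record i follows directly from the fixed record length.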
File Types
The file name consists of a name and an extension, usually separated by a period.
The extension indicates the type of the file and the operations that can be done on it.
E.g. in MS-DOS, only a file with a .com, .exe, or .bat extension can be executed.
The .com and .exe files are two forms of binary executable files, and a .bat file is a batch file containing commands to the operating system.
File type       Usual extension            Function
Executable      exe, com, bin, or none     ready-to-run machine-language program
Object          obj, o                     compiled, machine language, not linked
Source code     c, cc, java, pas, asm, a   source code in various languages
Batch           bat, sh                    commands to the command interpreter
Text            txt, doc                   textual data, documents
Word processor  wp, tex, rtf, doc          various word-processor formats
Library         lib, a, so, dll            libraries of routines for programmers
Print or view   ps, pdf, jpg               ASCII or binary file in a format for printing or viewing
Archive         arc, zip, tar              related files grouped into one file, sometimes compressed, for archiving or storage
Multimedia      mpeg, mov, rm, mp3, avi    binary file containing audio or A/V information
File Attributes
3. Directories
File systems have directories (folders) to keep track of files.
Single Level Directory Systems
- There is only one directory (the root directory) in the system.
- All files are contained in the same directory.
- It was used in early operating systems.
- Advantage: simple to implement, and files are easy to locate.
- Disadvantage: every file needs a distinct name, which is hard to guarantee when there are many files or several users.
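The naming problem of a single-level directory can be sketched with one shared table of names. The class `SingleLevelDirectory` and the file name `mailbox` are hypothetical and only illustrate the idea:

```python
class SingleLevelDirectory:
    """One flat directory shared by everyone: name -> file contents."""

    def __init__(self):
        self.entries = {}

    def create(self, name, data=b""):
        # Every name must be unique across the whole system.
        if name in self.entries:
            raise FileExistsError(name)
        self.entries[name] = data

    def lookup(self, name):
        return self.entries[name]

root = SingleLevelDirectory()
root.create("mailbox", b"user A's mail")
try:
    root.create("mailbox", b"user B's mail")   # second user, same name
    clash = False
except FileExistsError:
    clash = True
```

Lookup is a single table access, which is the simplicity advantage; the clash on the second `create` is the naming disadvantage.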
Hierarchical Directory Systems
- Related files are grouped together in directories.
- In a shared system, each user can have a private root directory for his or her own
hierarchy.
- Users can create directory structures with an arbitrary number of levels.
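A hierarchical directory can be sketched as a tree in which each directory maps names to files or to further directories. The class `Directory` and the user names below are hypothetical:

```python
class Directory:
    """A directory maps names to files (bytes) or to subdirectories."""

    def __init__(self):
        self.entries = {}

    def mkdir(self, name):
        # Return the existing subdirectory, or create it if missing.
        return self.entries.setdefault(name, Directory())

    def create(self, name, data=b""):
        self.entries[name] = data

root = Directory()
# Two users each get a private subtree (user names are hypothetical).
for user in ("abebe", "sara"):
    home = root.mkdir("home").mkdir(user)
    home.mkdir("projects").create("notes.txt", user.encode())

abebe_notes = root.entries["home"].entries["abebe"].entries["projects"].entries["notes.txt"]
sara_notes = root.entries["home"].entries["sara"].entries["projects"].entries["notes.txt"]
```

Both users own a file called `notes.txt` without any clash, because the names live in different directories.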
Path Names
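- Path names are used to locate files in the directory tree.
- Two different methods are commonly used:
Absolute path name: the path from the root directory to the file, e.g. /usr/ast/mailbox.
Relative path name: the path from the current (working) directory to the file, e.g. mailbox when the working directory is /usr/ast.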
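Path resolution can be sketched with Python's `posixpath` module, assuming UNIX-style names in which `/` separates components. A name that starts at the root is used as is, while any other name is interpreted relative to the working directory. The helper `resolve` and the example paths are illustrative:

```python
import posixpath

def resolve(path, cwd):
    """Turn a path into a normalized absolute path, given the working directory."""
    if not posixpath.isabs(path):
        path = posixpath.join(cwd, path)
    return posixpath.normpath(path)

cwd = "/usr/ast"
a = resolve("/usr/ast/mailbox", cwd)   # starts at the root: used as is
b = resolve("mailbox", cwd)            # interpreted relative to cwd
c = resolve("../lib/dictionary", cwd)  # ".." means the parent directory
```

Here `a` and `b` name the same file, and `c` resolves to `/usr/lib/dictionary`.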
4. Implementing Files
The key issue is keeping track of which disk blocks belong to which file, and allocating disk space to files.
- There are several alternative ways to allocate disk blocks to files.
Contiguous Allocation
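- Each file is stored in a run of consecutive disk blocks.
- Advantages:
Simple to implement: only the address of the first block and the number of blocks need to be recorded.
High read performance: the whole file can be read in one operation after a single seek.
- Disadvantages:
The file size must be specified at the time of file creation.
External fragmentation: as files are deleted, free space breaks into holes that may be too small to reuse.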
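External fragmentation can be shown with a small simulation. The sketch assumes a first-fit search over a list of per-block "in use" flags; the function names and the 10-block disk are hypothetical:

```python
def allocate_contiguous(disk, n):
    """First-fit: find n consecutive free blocks, mark them used, return start index.

    `disk` is a list of booleans (True = block in use). Returns None on failure.
    """
    run = 0
    for i, used in enumerate(disk):
        run = 0 if used else run + 1
        if run == n:
            start = i - n + 1
            disk[start:i + 1] = [True] * n
            return start
    return None

def free_blocks(disk, start, n):
    """Mark n blocks starting at `start` as free again."""
    disk[start:start + n] = [False] * n

disk = [False] * 10
a = allocate_contiguous(disk, 4)   # blocks 0-3
b = allocate_contiguous(disk, 3)   # blocks 4-6
c = allocate_contiguous(disk, 3)   # blocks 7-9
free_blocks(disk, a, 4)            # delete first file -> hole of 4
free_blocks(disk, c, 3)            # delete third file -> hole of 3
# 7 blocks are free in total, but no run of 5 exists: external fragmentation.
d = allocate_contiguous(disk, 5)
```

The final request fails even though enough blocks are free, because the free blocks are split into two separate holes.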
Linked List Allocation
- A file is kept as a linked list of disk blocks.
- Each block holds a pointer to the next block of the file; the rest of the block holds data.
- Advantages:
Disk space is used efficiently; no blocks are lost to holes between files.
The directory entry only needs to store the address of the first block of a file.
- Disadvantages:
Random access is slow: reaching block n requires reading the n blocks before it.
Some space in each block is taken up by the pointer to the next block.
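Why random access is slow can be seen by counting disk reads. The sketch assumes a simulated disk held in a dictionary, where every block stores its payload together with the number of the next block; the block numbers are arbitrary:

```python
# Simulated disk: block number -> (payload, next block number or None).
# Block numbers are arbitrary to show that a file need not be contiguous.
disk = {
    4: (b"AAAA", 7),
    7: (b"BBBB", 2),
    2: (b"CCCC", 10),
    10: (b"DDDD", None),
}

def read_file_block(disk, first_block, n):
    """Return (payload of the n-th block of the file, number of disk reads)."""
    block, reads = first_block, 0
    for _ in range(n):
        _, block = disk[block]   # must read the block just to learn the next pointer
        reads += 1
        if block is None:
            raise IndexError("file has fewer blocks")
    payload, _ = disk[block]
    return payload, reads + 1

payload, reads = read_file_block(disk, first_block=4, n=3)
```

Fetching the fourth block of the file (n = 3) costs four disk reads, one for every block on the way.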
Linked List Allocation Using a Table in Memory
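- The block pointers are taken out of the blocks and kept in a table, called the file
allocation table (FAT).
- The FAT is held in memory.
- Advantage: random access becomes faster, because the chain is followed in memory without disk reads.
- Disadvantage: the whole table must be in memory, which takes a lot of memory for a
large disk.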
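A minimal sketch of a FAT, assuming a Python list in which entry i holds the number of the block that follows block i. The end marker, the 4-byte entry size, and the 20 GB disk with 1 KB blocks are assumptions chosen for illustration:

```python
END = -1  # end-of-chain marker used in this sketch

# fat[i] is the block that follows block i in its file.
fat = [END] * 12
for cur, nxt in [(4, 7), (7, 2), (2, 10)]:
    fat[cur] = nxt

def nth_block(fat, first_block, n):
    """Follow the chain in memory and return the disk block holding file block n."""
    block = first_block
    for _ in range(n):
        block = fat[block]
        if block == END:
            raise IndexError("file has fewer blocks")
    return block

def fat_size_bytes(disk_bytes, block_bytes, entry_bytes=4):
    """Memory needed for one table entry per disk block (entry size is assumed)."""
    return (disk_bytes // block_bytes) * entry_bytes

block = nth_block(fat, first_block=4, n=3)
table_mb = fat_size_bytes(20 * 2**30, 1024) / 2**20
```

The chain is still followed link by link, but entirely in memory. The cost is the table itself: with these assumed sizes it occupies 80 MB.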
I-Nodes
- An i-node (index node) stores the attributes of a file and the addresses of its disk blocks.
- Advantage: only the i-nodes of open files need to be in memory.
- If a file is too large for its block addresses to fit in a single i-node, the last
entries of the i-node can point to a
block containing more disk block addresses.
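The lookup through an i-node can be sketched as follows, assuming a deliberately small i-node with four direct addresses and one single-indirect block. The class `INode`, the constant, and all block numbers are hypothetical:

```python
NUM_DIRECT = 4          # direct addresses per i-node (hypothetical, kept small)

class INode:
    def __init__(self, size, direct, indirect=None):
        self.size = size            # attribute: file size in blocks
        self.direct = direct        # list of disk block numbers
        self.indirect = indirect    # disk block number of the indirect block

def block_address(inode, disk, n):
    """Map file block n to a disk block number."""
    if n >= inode.size:
        raise IndexError("beyond end of file")
    if n < NUM_DIRECT:
        return inode.direct[n]
    # One extra disk read fetches the block of further addresses.
    return disk[inode.indirect][n - NUM_DIRECT]

# Disk block 50 serves as the indirect block for a 6-block file.
disk = {50: [31, 32]}
inode = INode(size=6, direct=[11, 12, 13, 14], indirect=50)
```

File blocks 0 to 3 are found directly in the i-node; blocks 4 and 5 need one additional read of the indirect block.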
5. Implementing Directories
- Two options for storing file names of varying length:
In-line: each entry has a fixed header that starts with the length of the entry, followed by the attributes of the file and then the name.
In a heap: all entries have the same fixed length, and the file names are kept in a heap at the end of the directory.
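The heap option can be sketched with fixed-size entries that refer to names stored separately. The entry layout below (name offset, name length, file size) is a hypothetical stand-in for a real set of attributes, encoded with Python's `struct` module:

```python
import struct

# Fixed-size entry: offset of the name in the heap, name length, file size.
# The field choice is a hypothetical stand-in for "attributes".
ENTRY = struct.Struct("<HHI")

def build_directory(files):
    """files: list of (name, size). Returns (entries bytes, heap bytes)."""
    entries, heap = b"", b""
    for name, size in files:
        raw = name.encode("utf-8")
        entries += ENTRY.pack(len(heap), len(raw), size)
        heap += raw
    return entries, heap

def list_directory(entries, heap):
    """Decode every fixed-size entry and fetch its name from the heap."""
    result = []
    for pos in range(0, len(entries), ENTRY.size):
        offset, length, size = ENTRY.unpack_from(entries, pos)
        result.append((heap[offset:offset + length].decode("utf-8"), size))
    return result

files = [("a.txt", 10), ("project-budget-2024.xlsx", 2048), ("x", 1)]
entries, heap = build_directory(files)
```

Every entry occupies the same 8 bytes regardless of name length, so removing a file leaves a slot that any later entry can reuse.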
6. Important file systems
FAT (File Allocation Table): old Windows and MS-DOS standard
NTFS (New Technology File System): current Windows standard
RAID (Redundant Array of Independent Disks): not a file system itself, but a way of combining several disks for redundancy and performance
FFS (Fast File System): Unix standard since the 1980s
LFS (Log-structured File System): Berkeley redesign for high performance