
Lecture 10 File Management

The document discusses file management and organization. It describes desirable file properties like long-term existence and structure. Files are organized into hierarchical structures and accessed through file systems. File systems provide functions to store and retrieve files through operations like create, delete, open, close, read and write. Files have a structure of fields, records, files and databases. The document also discusses different file organizations like sequential, direct, indexed and B-Trees which provide various access methods and performance tradeoffs.


Operating Systems: Internals and Design Principles
Lecture 10
File Management
By Joram M. Makasa
Files
 Data collections created by users
 The File System is one of the most important parts of the OS to a user
 Desirable properties of files:

Long-term existence
• files are stored on disk or other secondary storage and do not disappear when a user logs off
Sharable between processes
• files have names and can have associated access permissions that permit controlled sharing
Structure
• files can be organized into a hierarchical or more complex structure to reflect the relationships among files
File Systems
 Provide a means to store data organized as files as well as a collection of
functions that can be performed on files
 Maintain a set of attributes associated with the file
 Typical operations include:
 Create
 Delete
 Open
 Close
 Read
 Write
File Structure

Four terms are commonly used when discussing files:
 Field
 Record
 File
 Database


Structure Terms
Field
 basic element of data
 contains a single value
 fixed or variable length
Record
 collection of related fields that can be treated as a unit by some application program
 fixed or variable length
File
 collection of similar records
 treated as a single entity
 may be referenced by name
 access control restrictions usually apply at the file level
Database
 collection of related data
 relationships among elements of data are explicit
 designed for use by a number of different applications
 consists of one or more types of files
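One way to picture how the four terms nest is the small Python sketch below. The class names and the sample "students" data are illustrative assumptions, not part of the lecture; the sketch only shows that fields make up records, records make up files, and files make up a database.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A collection of related fields treated as a unit."""
    fields: dict  # field name -> single value


@dataclass
class File:
    """A named collection of similar records."""
    name: str
    records: list = field(default_factory=list)


@dataclass
class Database:
    """One or more files holding related data."""
    files: dict = field(default_factory=dict)  # file name -> File


# Hypothetical sample data for illustration only.
students = File("students")
students.records.append(Record({"id": "1001", "name": "Ada"}))
db = Database({students.name: students})
print(db.files["students"].records[0].fields["name"])
```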
File Management System
Objectives
 A file management system is the set of system software that provides services to users and applications in the use of files, with the following objectives:
 Meet the data management needs of the user
 Guarantee that the data in the file are valid
 Optimize performance
 Provide I/O support for a variety of storage device types
 Minimize the potential for lost or destroyed data
 Provide a standardized set of I/O interface routines to user processes
 Provide I/O support for multiple users in the case of multiple-user systems
Needs of Users: Minimal User Requirements
 Each user:
1) should be able to create, delete, read, write and modify files
2) may have controlled access to other users’ files
3) may control what type of accesses are allowed to the files
4) should be able to restructure the files in a form appropriate to the problem
5) should be able to move data between files
6) should be able to back up and recover files in case of damage
7) should be able to access his or her files by name rather than by numeric identifier
Device Drivers
 Lowest level
 Communicates directly with peripheral devices
 Responsible for starting I/O operations on a device
 Processes the completion of an I/O request
 Considered to be part of the operating system
Basic File System
 Also referred to as the physical I/O level
 Primary interface with the environment outside the computer system
 Deals with blocks of data that are exchanged with disk or tape systems
 Concerned with the placement of blocks on the secondary storage
device
 Concerned with buffering blocks in main memory
 Considered part of the operating system
Basic I/O Supervisor
 Responsible for all file I/O initiation and termination
 Control structures that deal with device I/O, scheduling, and file status
are maintained
 Selects the device on which I/O is to be performed
 Concerned with scheduling disk and tape accesses to optimize
performance
 I/O buffers are assigned and secondary memory is allocated at this level
 Part of the operating system
Logical I/O

 Enables users and applications to access records
 Provides general-purpose record I/O capability
 Maintains basic data about files
Access Method
 Level of the file system closest to the user
 Provides a standard interface between applications and the file
systems and devices that hold the data
 Different access methods reflect different file structures and different
ways of accessing and processing the data
Identify and locate the selected file
Records as appropriate to the file structure
Only authorized users are allowed access


File Organization and Access
 File organization is the logical structuring of the records as determined
by the way in which they are accessed
 In choosing a file organization, several criteria are important:
 short access time
 ease of update
 economy of storage
 simple maintenance
 reliability

 Priority of criteria depends on the application that will use the file
File Organization Types
Five of the common file organizations are:
 The pile
 The sequential file
 The indexed sequential file
 The indexed file
 The direct, or hashed, file
The Pile
 Least complicated form of file organization
 Data are collected in the order they arrive
 Each record consists of one burst of data
 Purpose is simply to accumulate the mass of data and save it
 Record access is by exhaustive search
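Exhaustive search over a pile can be sketched as follows. The record layout (a Python dict per record, possibly with different fields in each one) and the function name pile_search are assumptions made for illustration; the point is only that every record must be examined because the pile has no index or ordering.

```python
def pile_search(pile, field_name, value):
    """Return every record in the pile whose field_name equals value."""
    matches = []
    for record in pile:  # no index exists, so every record is examined
        if record.get(field_name) == value:
            matches.append(record)
    return matches


# Hypothetical pile: records are kept in arrival order.
pile = [
    {"name": "Ada", "dept": "math"},
    {"name": "Linus", "os": "linux"},  # assumed: records need not share fields
    {"name": "Grace", "dept": "navy"},
]
print(pile_search(pile, "dept", "navy"))
```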
The Sequential File
 Most common form of file structure
 A fixed format is used for records
 Key field uniquely identifies the record
 Typically used in batch applications
 Only organization that is easily stored on tape as well as disk
Indexed Sequential File
 Adds an index to the file to support random access
 Adds an overflow file
 Greatly reduces the time required to access a single record
 Multiple levels of indexing can be used to provide greater efficiency in access
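A minimal sketch of a lookup in an indexed sequential file is given below. It assumes a single-level sparse index with one entry per group of three records and an overflow area that is simply scanned; a real implementation would more likely link overflow records from the main file. All names and the sample data are hypothetical.

```python
import bisect

# Main file: records sorted by key (assumed sample data).
main_file = [(10, "a"), (20, "b"), (30, "c"), (40, "d"), (50, "e"), (60, "f")]
# Sparse index: (first key in the group, position of that record in main_file).
index = [(10, 0), (40, 3)]
# Overflow file: records added after the main file was built.
overflow = [(35, "x")]


def lookup(key):
    """Use the index to pick one group, scan it, then fall back to overflow."""
    index_keys = [k for k, _ in index]
    slot = bisect.bisect_right(index_keys, key) - 1
    if slot >= 0:
        start = index[slot][1]
        end = index[slot + 1][1] if slot + 1 < len(index) else len(main_file)
        for k, value in main_file[start:end]:  # short sequential scan
            if k == key:
                return value
    for k, value in overflow:
        if k == key:
            return value
    return None
```

Only one group of the main file is scanned per lookup, which is where the reduction in single-record access time comes from.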
Indexed File
 Records are accessed only through their indexes
 Variable-length records can be employed
 Exhaustive index contains one entry for every record in the main file
 Partial index contains entries to records where the field of interest exists
 Used mostly in applications where timeliness of information is critical
 Examples would be airline reservation systems and inventory control systems
Direct or Hashed File
 Access directly any block of a known address
 Makes use of hashing on the key value
 Often used where:
 very rapid access is required
 fixed-length records are used
 records are always accessed one at a time
Examples are:
• directories
• pricing tables
• schedules
• name lists
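Hashing on the key value can be sketched as below. The toy hash function (sum of the key's bytes modulo the block count), the block count of 8, and the pricing-table data are assumptions chosen for illustration, not a scheme named in the lecture.

```python
NUM_BLOCKS = 8  # assumed size of the file in blocks


def block_address(key):
    """Toy hash: map a string key to a block number."""
    return sum(key.encode("utf-8")) % NUM_BLOCKS


blocks = [[] for _ in range(NUM_BLOCKS)]


def store(key, value):
    """Place the record in the block its key hashes to."""
    blocks[block_address(key)].append((key, value))


def fetch(key):
    """Go straight to the hashed block; only that block is searched."""
    for k, v in blocks[block_address(key)]:
        if k == key:
            return v
    return None


# Hypothetical pricing table.
store("milk", 120)
store("bread", 80)
```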
B-Trees
 A balanced tree structure with all branches of equal length
 Standard method of organizing indexes for databases
 Commonly used in OS file systems
 Provides for efficient searching, adding, and deleting of items
B-Tree Characteristics
A B-tree is characterized by its minimum degree d and satisfies the following properties:
 every node has at most 2d – 1 keys and 2d children or, equivalently, 2d pointers
 every node, except for the root, has at least d – 1 keys and d pointers; as a result, each internal node, except the root, is at least half full and has at least d children
 the root has at least 1 key and 2 children
 all leaves appear on the same level and contain no information. This is a logical construct to terminate the tree; the actual implementation may differ.
 a nonleaf node with k pointers contains k – 1 keys
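The key-count bounds and a search can be sketched as follows. The node representation (a dict with a sorted "keys" list and a "children" list, empty at the bottom level) is an assumption for illustration, and the sample tree uses minimum degree d = 2.

```python
def key_bounds(d, is_root=False):
    """Return (min_keys, max_keys) for a node of a B-tree with minimum degree d."""
    min_keys = 1 if is_root else d - 1
    return min_keys, 2 * d - 1


def search(node, key):
    """Return True if key is stored in the subtree rooted at node."""
    i = 0
    while i < len(node["keys"]) and key > node["keys"][i]:
        i += 1
    if i < len(node["keys"]) and node["keys"][i] == key:
        return True
    if not node["children"]:  # bottom level reached, key is absent
        return False
    return search(node["children"][i], key)


# Hypothetical tree with d = 2: the root has 2 keys and therefore 3 pointers.
tree = {"keys": [20, 40], "children": [
    {"keys": [5, 10], "children": []},
    {"keys": [25, 30], "children": []},
    {"keys": [45, 50, 60], "children": []},
]}
```

Each step of the search follows exactly one pointer, so the number of nodes visited is bounded by the height of the tree.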
Table 12.1
Information Elements of a File Directory
(Table can be found on page 537 in textbook)


Operations Performed on a Directory
To understand the requirements for a file structure, it is helpful to consider the types of operations that may be performed on the directory:
 Search
 Create files
 Delete files
 List directory
 Update directory
Two-Level Scheme
 There is one directory for each user and a master directory
 Master directory has an entry for each user directory providing address and access control information
 Each user directory is a simple list of the files of that user
 Names must be unique only within the collection of files of a single user
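The two-level scheme can be pictured as a dictionary of dictionaries. In this sketch the user names, file names, and the integers standing in for file addresses are all hypothetical, and access control information is left out for brevity.

```python
# Master directory: one entry per user, each pointing to that user's directory.
master_directory = {
    "alice": {"notes.txt": 17, "report.doc": 42},
    "bob": {"notes.txt": 88},  # same name as alice's file, no conflict
}


def resolve(user, file_name):
    """Look up the user directory first, then the file within it."""
    user_directory = master_directory.get(user)
    if user_directory is None:
        return None
    return user_directory.get(file_name)
```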
File Sharing

Two issues arise when allowing files to be shared among a number of users:

 access rights
 management of simultaneous access
Access Rights
 None
• the user would not be allowed to read the user directory that includes the file
 Knowledge
• the user can determine that the file exists and who its owner is and can then petition the owner for additional access rights
 Execution
• the user can load and execute a program but cannot copy it
 Reading
• the user can read the file for any purpose, including copying and execution
 Appending
• the user can add data to the file but cannot modify or delete any of the file’s contents
 Updating
• the user can modify, delete, and add to the file’s data
 Changing protection
• the user can change the access rights granted to other users
 Deletion
• the user can delete the file from the file system
User Access Rights

Owner
• usually the initial creator of the file
• has full rights
• may grant rights to others
Specific Users
• individual users who are designated by user ID
User Groups
• a set of users who are not individually defined
All
• all users who have access to this system
• these are public files
Record Blocking
 Blocks are the unit of I/O with secondary storage
 for I/O to be performed, records must be organized as blocks
 Given the size of a block, three methods of blocking can be used:
1) Fixed-Length Blocking – fixed-length records are used, and an integral number of records are stored in a block
Internal fragmentation – unused space at the end of each block
2) Variable-Length Spanned Blocking – variable-length records are used and are packed into blocks with no unused space
3) Variable-Length Unspanned Blocking – variable-length records are used, but spanning is not employed
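For fixed-length blocking, the number of records per block and the internal fragmentation follow from integer division. The sketch below assumes a 4096-byte block and 300-byte records purely as example figures.

```python
def fixed_blocking(block_size, record_size):
    """Return (records per block, unused bytes at the end of each block)."""
    records_per_block = block_size // record_size  # only whole records fit
    wasted = block_size - records_per_block * record_size
    return records_per_block, wasted


# Assumed example sizes: 4096-byte blocks, 300-byte records.
print(fixed_blocking(4096, 300))
```

With these sizes, 13 records fit in a block and 196 bytes at the end of each block go unused.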
File Allocation
 On secondary storage, a file consists of a collection of blocks
 The operating system or file management system is responsible for
allocating blocks to files

 The approach taken for file allocation may influence the approach taken
for free space management

 Space is allocated to a file as one or more portions (contiguous set of allocated blocks)
 File allocation table (FAT)
 data structure used to keep track of the portions assigned to a file
Preallocation vs
Dynamic Allocation
 A preallocation policy requires that the maximum size of a file be
declared at the time of the file creation request
 For many applications it is difficult to estimate reliably the maximum
potential size of the file
 tends to be wasteful because users and application programmers tend to
overestimate size

 Dynamic allocation allocates space to a file in portions as needed


Portion Size
 In choosing a portion size there is a trade-off between efficiency from
the point of view of a single file versus overall system efficiency
 Items to be considered:
1) contiguity of space increases performance, especially for
Retrieve_Next operations, and greatly for transactions running in
a transaction-oriented operating system
2) having a large number of small portions increases the size of
tables needed to manage the allocation information
3) having fixed-size portions simplifies the reallocation of space
4) having variable-size or small fixed-size portions minimizes waste
of unused storage due to overallocation
Alternatives
Two major alternatives:

Variable, large contiguous portions
• provides better performance
• the variable size avoids waste
• the file allocation tables are small
Blocks
• small fixed portions provide greater flexibility
• they may require large tables or complex structures for their allocation
• contiguity has been abandoned as a primary goal
• blocks are allocated as needed
Table 12.2
File Allocation Methods
Free Space Management
 Just as allocated space must be managed, so must the unallocated space
 To perform file allocation, it is necessary to know which blocks are
available
 A disk allocation table is needed in addition to a file allocation table
Bit Tables
 This method uses a vector containing one bit for each block on the disk
 Each entry of a 0 corresponds to a free block, and each 1 corresponds to
a block in use

Advantages:
• works well with any file allocation method
• it is as small as possible
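A bit table can be sketched with a list of 0/1 flags, following the convention above that 0 means free and 1 means in use. The class name, the first-fit allocation policy, and the 16 GB disk with 512-byte blocks used in the size calculation are assumptions for illustration.

```python
class BitTable:
    def __init__(self, num_blocks):
        self.bits = [0] * num_blocks  # 0 = free block, 1 = block in use

    def allocate(self):
        """Mark the first free block as used and return its number, or None."""
        for block, bit in enumerate(self.bits):
            if bit == 0:
                self.bits[block] = 1
                return block
        return None

    def free(self, block):
        """Return a block to the free pool."""
        self.bits[block] = 0


def table_size_bytes(disk_bytes, block_bytes):
    """One bit per block, eight bits per byte."""
    return disk_bytes // block_bytes // 8


# Assumed example: a 16 GB disk with 512-byte blocks needs a 4 MB bit table.
print(table_size_bytes(16 * 2**30, 512))
```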
Chained Free Portions
 The free portions may be chained together by using a pointer and length
value in each free portion
 Negligible space overhead because there is no need for a disk allocation
table
 Suited to all file allocation methods

Disadvantages:
• leads to fragmentation
• every time you allocate a block you need to read the block first to recover the pointer to the new first free block before writing data to that block
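The extra read on every allocation can be seen in a small simulation. This sketch simplifies the scheme to single free blocks rather than variable-length portions, and models the disk as a dictionary from block number to the pointer stored inside that block; both choices, and the class name, are assumptions.

```python
class FreeChain:
    def __init__(self, next_free, head):
        # next_free[b] is the pointer stored inside free block b.
        self.next_free = dict(next_free)
        self.head = head  # number of the first free block, or None

    def allocate(self):
        """Take the first free block; its stored pointer becomes the new head."""
        if self.head is None:
            return None
        block = self.head
        # Simulates reading the block to recover the pointer before reuse.
        self.head = self.next_free.pop(block)
        return block


# Hypothetical chain of free blocks: 3 -> 7 -> 9.
chain = FreeChain({3: 7, 7: 9, 9: None}, head=3)
```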
Indexing
 Treats free space as a file and uses an index table as it would for file
allocation
 For efficiency, the index should be on the basis of variable-size
portions rather than blocks
 This approach provides efficient support for all of the file allocation
methods
Free Block List

 Each block is assigned a number sequentially
 the list of the numbers of all free blocks is maintained in a reserved portion of the disk
Volumes
 A collection of addressable sectors in secondary
memory that an OS or application can use for data
storage
 The sectors in a volume need not be consecutive on
a physical storage device
 they need only appear that way to the OS or application

 A volume may be the result of assembling and merging smaller volumes
UNIX File Management
 In the UNIX file system, six types of files are distinguished:
Regular, or ordinary
• contains arbitrary data in zero or more data blocks
Directory
• contains a list of file names plus pointers to associated inodes
Special
• contains no data but provides a mechanism to map physical devices to file names
Named pipes
• an interprocess communications facility
Links
• an alternative file name for an existing file
Symbolic links
• a data file that contains the name of the file it is linked to
Inodes
 All types of UNIX files are administered by the OS by means of inodes
 An inode (index node) is a control structure that contains the key
information needed by the operating system for a particular file
 Several file names may be associated with a single inode
 an active inode is associated with exactly one file
 each file is controlled by exactly one inode
File Allocation
 File allocation is done on a block basis
 Allocation is dynamic, as needed, rather than using preallocation
 An indexed method is used to keep track of each file, with part of the
index stored in the inode for the file
 In all UNIX implementations the inode includes a number of direct
pointers and three indirect pointers (single, double, triple)
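The reach of the direct and indirect pointers can be worked out with a short calculation. The 4 kByte block size comes from the caption of Table 12.3; the 12 direct pointers and 8-byte block addresses are assumptions made here for illustration and are not stated on the slides.

```python
def blocks_per_level(block_size, pointer_size, direct_pointers):
    """Number of data blocks reachable through each kind of pointer."""
    per_block = block_size // pointer_size  # pointers held by one index block
    return {
        "direct": direct_pointers,
        "single indirect": per_block,
        "double indirect": per_block ** 2,
        "triple indirect": per_block ** 3,
    }


BLOCK_SIZE = 4096  # from the Table 12.3 caption
levels = blocks_per_level(BLOCK_SIZE, pointer_size=8, direct_pointers=12)  # assumed values
max_file_bytes = sum(levels.values()) * BLOCK_SIZE

for name, count in levels.items():
    print(name, count, "blocks,", count * BLOCK_SIZE, "bytes")
```

Under these assumptions each index block holds 512 pointers, so the triple indirect pointer alone covers 512 cubed blocks, which dominates the maximum file size.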
Table 12.3
Capacity of a FreeBSD File with 4 kByte Block Size
Volume Structure
 A UNIX file system resides on a single logical disk or disk partition and is laid out with the following elements:
Boot block
• contains code required to boot the operating system
Superblock
• contains attributes and information about the file system
Inode table
• collection of inodes for each file
Data blocks
• storage space available for data files and subdirectories
Primary Object Types in VFS

Superblock Object
• represents a specific mounted file system
Inode Object
• represents a specific file
Dentry Object
• represents a specific directory entry
File Object
• represents an open file associated with a process
Windows File System
 The developers of Windows NT designed a new file system, the New Technology File System (NTFS), which is intended to meet high-end requirements for workstations and servers
 Key features of NTFS:
 recoverability
 security
 large disks and large files
 multiple data streams
 journaling
 compression and encryption
 hard and symbolic links
NTFS Volume
and File Structure
 NTFS makes use of the following disk storage concepts:

Sector
• the smallest physical storage unit on the disk
• the data size in bytes is a power of 2 and is almost always 512 bytes
Cluster
• one or more contiguous sectors
• the cluster size in sectors is a power of 2
Volume
• a logical partition on a disk, consisting of one or more clusters and used by a file system to allocate space
• can be all or a portion of a single disk or it can extend across multiple disks
• the maximum volume size for NTFS is 2^64 bytes
Table 12.4
Windows NTFS Partition and Cluster Sizes
Master File Table (MFT)
 The heart of the Windows file system is the MFT
 The MFT is organized as a table of 1,024-byte rows, called records
 Each row describes a file on this volume, including the MFT itself,
which is treated as a file
 Each record in the MFT consists of a set of attributes that serve to define
the file (or folder) characteristics and the file contents
Table 12.5
Windows NTFS File and Directory Attribute Types
SQLite
 Most widely deployed SQL database engine in the world

 Based on the Structured Query Language (SQL)

 Designed to provide a streamlined SQL-based database management system suitable for embedded systems and other limited memory systems
 The full SQLite library can be implemented in under 400 KB
 In contrast to other database management systems, SQLite is not a separate process that is accessed from the client application
 the library is linked in and thus becomes an integral part of the application program
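The in-process nature of SQLite can be seen from Python's standard sqlite3 module, which calls the SQLite library directly rather than connecting to a server. The table name, columns, and rows below are hypothetical sample data.

```python
import sqlite3

# ":memory:" creates a database inside this process; no server is contacted.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT PRIMARY KEY, size INTEGER)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?)",
    [("a.txt", 120), ("b.txt", 4096)],  # assumed sample rows
)
rows = conn.execute(
    "SELECT name FROM files WHERE size > ? ORDER BY name", (1000,)
).fetchall()
conn.close()
print(rows)
```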
Summary
 File structure
 File management systems
 File organization and access
• The pile
• The sequential file
• The indexed sequential file
• The indexed file
• The direct or hashed file
 B-Trees
 File directories
• Contents
• Structure
• Naming
 File sharing
• Access rights
• Simultaneous access
 Record blocking
 Secondary storage management
• File allocation
• Free space management
• Volumes
• Reliability
 UNIX file management
• Inodes
• File allocation
• Directories
• Volume structure
 Linux virtual file system
• Superblock object
• Inode object
• Dentry object
• File object
• Caches
 Windows file system
• Key features of NTFS
• NTFS volume and file structure
• Recoverability
 Android file management
• File system
