File Organization
A file system is a fundamental component in computing that governs how files are
named, stored, and retrieved from a storage device.
1. Definition:
o A file system defines the rules and structures for managing files and
directories on storage devices.
o It ensures efficient access to data by organizing files and providing
mechanisms for creating, reading, updating, and deleting them. When
you open a file, copy, edit, or delete it, the file system handles these
operations behind the scenes.
2. Importance:
o Without a file system, storage devices would be like chaotic rooms
with scattered papers.
o A file system transforms raw data into organized files, making the
storage device useful.
o Beyond bookkeeping, it manages space, metadata, encryption, access
control, and data integrity.
3. Partitioning:
o Before use, storage devices are partitioned into logical regions.
o Partitioning allows separate management of different regions as if they
were distinct storage devices.
o Tools provided by operating systems or system firmware handle
partitioning.
4. Responsibilities:
o Space management: Allocating and reclaiming storage space
efficiently.
o Metadata: Storing information about files (e.g., size, permissions,
timestamps).
o Data encryption: Protecting sensitive data.
o File access control: Enforcing permissions.
o Data integrity: Ensuring files remain consistent.
File Organization
File organization determines how the records of a file are arranged on storage so that
they are available for processing. The aim is to choose an efficient file organization
for each base relation.
For example, if we want to retrieve employee records in alphabetical order of name,
sorting the file by employee name is a good file organization. However, if we want to
retrieve all employees whose marks are in a certain range, a file ordered by employee
name would not be a good file organization.
1. Sequential access file organization
Storing and sorting records in contiguous blocks within files on tape or disk is called
sequential access file organization.
In sequential access file organization, all records are stored in sequential order,
arranged in ascending or descending order of a key field.
A search of a sequential file starts from the beginning of the file, and new records
can be added at the end of the file.
In a sequential file, it is not possible to add a record in the middle of the file
without rewriting the file.
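The behaviour described above can be sketched in C as follows. This is a minimal illustration, not a definitive implementation: the Record layout and the function names seq_append and seq_search are assumptions made for this example, and the records are fixed-length so that they can be read one after another.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical fixed-length record; the field names are illustrative. */
typedef struct {
    int  key;        /* key field the file is ordered by */
    char name[20];
} Record;

/* Add a record at the end of the file. Returns 1 on success, 0 on failure. */
int seq_append(FILE *fp, const Record *rec)
{
    assert(fp != NULL && rec != NULL);
    if (fseek(fp, 0L, SEEK_END) != 0)
        return 0;
    return fwrite(rec, sizeof *rec, 1, fp) == 1;
}

/* Search from the beginning of the file, one record at a time.
   Returns the 0-based position of the record, or -1 if the key is absent. */
long seq_search(FILE *fp, int key, Record *out)
{
    Record rec;
    long pos = 0;

    assert(fp != NULL && out != NULL);
    rewind(fp);
    while (fread(&rec, sizeof rec, 1, fp) == 1) {
        if (rec.key == key) {
            *out = rec;
            return pos;
        }
        pos++;
    }
    return -1;
}
```

The search cost grows with the position of the record, which is why random searching is impractical with this organization.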
Advantages of sequential file
It is simple to program and easy to design.
A sequential file makes the best use of storage space.
Disadvantages of sequential file
Processing a sequential file is time-consuming, because records must be read in order.
It has high data redundancy.
Random searching is not possible.
2. Direct access file organization
A direct access file is also known as a random access or relative file organization.
In a direct access file, all records are stored on a direct access storage device (DASD),
such as a hard disk. The records are placed throughout the file in no particular order.
The records do not need to be in sequence because they are updated directly and
rewritten back in the same location.
This file organization is useful for immediate access to large amounts of information.
It is used in accessing large databases.
It is also called hashing, because the location of a record is typically computed from
its key by a hash function.
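A minimal sketch of this idea in C is shown below. It assumes fixed-length records, a file of NUM_SLOTS slots, and a simple modulo hash; all names (Record, slot_of, direct_init, direct_write, direct_read) are hypothetical. Collisions, where two keys hash to the same slot, are deliberately not handled here.

```c
#include <assert.h>
#include <stdio.h>

#define NUM_SLOTS 8   /* hypothetical number of record slots in the file */

typedef struct {
    int  in_use;      /* 0 = empty slot, 1 = occupied */
    int  key;
    char name[20];
} Record;

/* Hash function: map a key to a slot number (relative record number). */
static long slot_of(int key)
{
    return (long)((unsigned)key % NUM_SLOTS);
}

/* Fill the file with empty slots so that every slot can be addressed. */
int direct_init(FILE *fp)
{
    Record empty = {0};
    assert(fp != NULL);
    rewind(fp);
    for (int i = 0; i < NUM_SLOTS; i++)
        if (fwrite(&empty, sizeof empty, 1, fp) != 1)
            return 0;
    return 1;
}

/* Write the record straight into its slot; no other record is touched. */
int direct_write(FILE *fp, const Record *rec)
{
    Record slot = *rec;
    slot.in_use = 1;
    if (fseek(fp, slot_of(rec->key) * (long)sizeof slot, SEEK_SET) != 0)
        return 0;
    return fwrite(&slot, sizeof slot, 1, fp) == 1;
}

/* Read the record for key directly. Returns 1 if found, 0 otherwise. */
int direct_read(FILE *fp, int key, Record *out)
{
    if (fseek(fp, slot_of(key) * (long)sizeof *out, SEEK_SET) != 0)
        return 0;
    if (fread(out, sizeof *out, 1, fp) != 1)
        return 0;
    return out->in_use && out->key == key;
}
```

Each read or write costs one seek regardless of how many records the file holds, which is what makes this organization attractive for immediate access.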
Advantages of direct access file organization
A direct access file helps in online transaction processing (OLTP) systems, such as an
online railway reservation system.
In a direct access file, sorting of the records is not required.
It accesses the desired records immediately.
It updates several files quickly.
It has better control over record allocation.
Disadvantages of direct access file organization
A direct access file does not provide a backup facility.
It is expensive.
It uses storage space less efficiently than a sequential file.
3. Indexed sequential access file organization
Indexed sequential access file organization combines both sequential file and direct
access file organization.
In an indexed sequential access file, records are stored on a direct access device, such
as a magnetic disk, and are located by a primary key.
The file can have multiple keys, and these keys can be alphanumeric. The key by which
the records are ordered is called the primary key.
The data can be accessed either sequentially or randomly using the index. The index is
stored in a file and read into memory when the file is opened.
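The role of the index can be sketched in C as below. This is an illustration under stated assumptions: records are fixed-length and already stored in ascending key order, the index is rebuilt by scanning the data file rather than read from a separate index file, and the names IndexEntry, build_index and indexed_read are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

typedef struct { int key; char name[20]; } Record;      /* illustrative layout */
typedef struct { int key; long offset; } IndexEntry;    /* key -> file offset */

/* Read the data file once and build the in-memory index.
   Assumes records are stored in ascending key order.
   Returns the number of index entries created. */
int build_index(FILE *fp, IndexEntry index[], int max)
{
    Record rec;
    int n = 0;

    assert(fp != NULL && index != NULL);
    rewind(fp);
    while (n < max) {
        long off = ftell(fp);
        if (fread(&rec, sizeof rec, 1, fp) != 1)
            break;
        index[n].key = rec.key;
        index[n].offset = off;
        n++;
    }
    return n;
}

/* Random access: binary-search the index, then seek straight to the record.
   Returns 1 if the key was found, 0 otherwise. */
int indexed_read(FILE *fp, const IndexEntry index[], int n, int key, Record *out)
{
    int lo = 0, hi = n - 1;

    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (index[mid].key == key) {
            if (fseek(fp, index[mid].offset, SEEK_SET) != 0)
                return 0;
            return fread(out, sizeof *out, 1, fp) == 1;
        }
        if (index[mid].key < key)
            lo = mid + 1;
        else
            hi = mid - 1;
    }
    return 0;
}
```

Sequential access needs no index at all: reading the data file from the start yields the records in key order.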
Advantages of Indexed sequential access file organization
In an indexed sequential access file, both sequential and random file access are possible.
It accesses the records very fast if the index table is properly organized.
The records can be inserted in the middle of the file.
It provides quick access for sequential and direct processing.
It reduces the degree of the sequential search.
Disadvantages of Indexed sequential access file organization
An indexed sequential access file requires unique keys and periodic reorganization.
Searching the index adds time to each data access or retrieval.
It requires more storage space than other file organizations, because the index is
stored in addition to the data.
It is expensive because it requires special software.
Primitive data types are the basic data types available in most programming
languages.
They are used to represent a single value.
Data types derived from primitive data types are known as non-primitive
data types.
Non-primitive data types are used to store a group of values.
They can be divided into two types, linear and non-linear. The linear types are
described below:
Types         Description
Arrays        An array is a collection of elements. It is used in mathematical
              problems such as matrices and algebra. Each element of an array
              is referenced by a subscripted variable or value, called a
              subscript or index, enclosed in brackets or parentheses.
Linked list   A linked list is a collection of data elements called nodes. Each
              node consists of two parts: Info and Link. Info holds the data
              and Link is the address of the next node. A linked list can be
              implemented by using pointers.
Stack         A stack is a list of elements in which an element may be inserted
              or deleted only at one end, known as the top of the stack. It
              supports two operations: Push and Pop. Push means adding an
              element to the stack and Pop means removing an element from the
              stack. It is also called a Last-in-First-out (LIFO) list.
Queue         A queue is a linear list of elements. In a queue, elements are
              added at one end, called the rear, and existing elements are
              deleted from the other end, called the front. It is also called
              a First-in-First-out (FIFO) list.
In a linked list, each node consists of its own data and the address of the
next node, so that the nodes form a chain.
The above figure shows the sequence of a linked list, which contains data
items connected together via links. It can be visualized as a chain of
nodes, where every node points to the next node.
A linked list contains a link element called first, and each link carries a data
item. The entry point into the linked list is called the head of the list.
The link field is called next, and each link is linked with its next link. The last
link carries a link to null to mark the end of the list.
Note: Head is not a separate node but a reference to the first node. If
the list is empty, the head is a null reference.
A real-life example of a linked list is a railway train. It starts
from the engine, and then the coaches follow. One can move from one
coach to another only if the coaches are connected to each other.
The basic operations on a linked list are:
1. Create
2. Insert
3. Delete
4. Traverse
5. Search
6. Concatenation
7. Display
1. Create
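A minimal sketch of the create operation in C is given here, assuming a node that stores an int and a pointer to the next node. The type name node follows the prototype used for the delete operation in this document; create_node and free_list are hypothetical helper names.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    int data;            /* Info part */
    struct node *next;   /* Link part: address of the next node */
} node;

/* Allocate a node that is not yet linked to any other node.
   Returns NULL if memory could not be allocated. */
node *create_node(int data)
{
    node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->data = data;
        n->next = NULL;   /* a single node is also the end of the list */
    }
    return n;
}

/* Release every node of the list, starting from the head. */
void free_list(node *head)
{
    while (head != NULL) {
        node *next = head->next;
        free(head);
        head = next;
    }
}
```

An empty list is simply a head pointer equal to NULL; creating the first node and storing its address in head produces a list of one element.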
2. Insert
Inserting an element
The above figure continues the example of the create operation: the
next element (i.e., 22) is added as the next node by using the insert operation.
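Insertion can be sketched in C for the three usual positions. The function names insert_front, insert_end and insert_after are assumptions for this example, as is the int payload; only the relinking steps matter.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    int data;
    struct node *next;
} node;

/* Insert at the beginning: the new node becomes the head. */
int insert_front(node **head, int data)
{
    assert(head != NULL);
    node *n = malloc(sizeof *n);
    if (n == NULL) return 0;
    n->data = data;
    n->next = *head;
    *head = n;
    return 1;
}

/* Insert at the end: walk to the last (null) link and attach the node there. */
int insert_end(node **head, int data)
{
    assert(head != NULL);
    node *n = malloc(sizeof *n);
    if (n == NULL) return 0;
    n->data = data;
    n->next = NULL;
    node **link = head;
    while (*link != NULL)
        link = &(*link)->next;
    *link = n;
    return 1;
}

/* Insert in the middle: place the new node directly after prev. */
int insert_after(node *prev, int data)
{
    assert(prev != NULL);
    node *n = malloc(sizeof *n);
    if (n == NULL) return 0;
    n->data = data;
    n->next = prev->next;   /* new node points to the old successor */
    prev->next = n;         /* predecessor now points to the new node */
    return 1;
}
```

In every case only one or two links change; no existing node has to be moved.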
3. Delete
int delete (node** head, node* n); // Delete the node n if it exists.
i. From the beginning of the list
When deleting the node at the beginning of the list, no relinking of nodes
needs to be performed, because the first node has no preceding node. The
above figure shows the removal of the node with x. However, the pointer to
the beginning of the list must be fixed, which is shown in the figure below:
ii. From the middle of the list
Deleting a node from the middle requires the preceding node to skip over
the node being removed.
The above figure shows the removal of the node with x. It means that we
need a reference to the node before it in order to remove it.
iii. From the end of the list
Deleting a node from the end requires that the preceding node becomes
the new end of the list, which points to nothing after it. The above figure
shows the removal of the node with z.
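The three cases can be handled by one function, sketched below in C. It mirrors the prototype shown for the delete operation but is named delete_node here (a hypothetical name, chosen because delete is a reserved word in C++). The pointer-to-pointer technique is one common way to write it, not necessarily the one the original figures assume.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    int data;
    struct node *next;
} node;

/* Delete node n if it is in the list.
   Returns 1 if the node was removed, 0 if it was not found. */
int delete_node(node **head, node *n)
{
    assert(head != NULL);
    /* link points at the pointer that leads to the current node:
       first the head pointer itself, then each node's next field. */
    node **link = head;
    while (*link != NULL && *link != n)
        link = &(*link)->next;
    if (*link == NULL)
        return 0;          /* n is not in the list (or n is NULL) */
    *link = n->next;       /* head or preceding node now skips over n */
    free(n);
    return 1;
}
```

When n is the first node, link is the head pointer, so the beginning of the list is fixed. When n is the last node, n->next is NULL, so the preceding node becomes the new end of the list.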
4. Traverse
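Traversal visits every node once by following the next links from the head until the null link is reached. The C sketch below assumes an int payload; traverse and add_to_sum are hypothetical names, and the callback parameter is only one way to let the caller decide what to do with each node.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node {
    int data;
    struct node *next;
} node;

/* Visit every node from the head to the null link.
   Calls visit(data, ctx) for each node if visit is not NULL.
   Returns the number of nodes visited. */
size_t traverse(const node *head, void (*visit)(int data, void *ctx), void *ctx)
{
    size_t count = 0;
    for (const node *cur = head; cur != NULL; cur = cur->next) {
        if (visit != NULL)
            visit(cur->data, ctx);
        count++;
    }
    return count;
}

/* Example visitor: add each value to the running total that ctx points to. */
void add_to_sum(int data, void *ctx)
{
    assert(ctx != NULL);
    *(long *)ctx += data;
}
```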
5. Search
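Searching a linked list is a traversal that stops at the first node whose data matches the key. The sketch below is a minimal C version under the assumption of an int payload; the function name search is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node {
    int data;
    struct node *next;
} node;

/* Return the first node whose data equals key, or NULL if no node matches. */
node *search(node *head, int key)
{
    for (node *cur = head; cur != NULL; cur = cur->next)
        if (cur->data == key)
            return cur;
    return NULL;
}
```

Because nodes can only be reached through their predecessors, the search is always sequential, even if the values happen to be sorted.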
7. Display
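Displaying a list means traversing it and printing each data item. In the C sketch below, the output format "10 -> 20 -> NULL" and the names list_to_string and display are assumptions made for illustration; formatting into a buffer first keeps the logic easy to check.

```c
#include <assert.h>
#include <stdio.h>

typedef struct node {
    int data;
    struct node *next;
} node;

/* Format the list as "10 -> 20 -> NULL" into buf.
   Returns the number of characters written (excluding the terminator),
   or -1 if buf is too small. */
int list_to_string(const node *head, char *buf, size_t size)
{
    size_t used = 0;
    int n;

    assert(buf != NULL);
    for (const node *cur = head; cur != NULL; cur = cur->next) {
        n = snprintf(buf + used, size - used, "%d -> ", cur->data);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }
    n = snprintf(buf + used, size - used, "NULL");
    if (n < 0 || (size_t)n >= size - used)
        return -1;
    return (int)(used + (size_t)n);
}

/* Print the whole list on one line. */
void display(const node *head)
{
    char buf[256];
    if (list_to_string(head, buf, sizeof buf) >= 0)
        puts(buf);
}
```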
A linked list in which each node has a single link to another node is called a
singly linked list.
A singly linked list does not store any pointer or reference to the
previous node.
Each node stores the contents of the node and a reference to the next
node in the list.
In a singly linked list, the last node holds a null pointer, which indicates that it is
the last node. Storing a singly linked list requires only a reference to its first
node.
Successive nodes are linked together in a linear way, and each node contains the
address of the next node to be followed.
Each node has a successor and a predecessor, except that the first node has no
predecessor and the last node has no successor. The last node has NULL as its
successor reference.
Each node has only a single link, to the next node.
In this type of linked list, only forward sequential movement is possible;
no direct access is allowed.
In the above figure, the address of the first node is always stored in a
reference known as Head or Front. The reference part of the last node
must be null.
A doubly linked list is a sequence of elements in which every node has a link
to its previous node and to its next node.
Traversal can be done in both directions to display the contents of the
whole list.
In the above figure, the Link1 field stores the address of the previous node
and the Link2 field stores the address of the next node. The Data Item field
stores the actual value of that node. If we insert data into the linked list,
it will look as follows:
Important Note:
The first node is always pointed to by head. In a doubly linked list, the previous
field of the first node must always be NULL, and the next field of the
last node must be NULL.
In the above figure, we see that each node of a doubly linked list contains three
fields. The two links of each node allow traversal of the list in either direction, so
there is no need to traverse the list to find the previous node. We can traverse
from head to tail as well as from tail to head.
In a doubly linked list, each node requires extra space for the previous pointer.
Operations that modify the list, such as Insert and Delete, must also maintain the
extra previous pointer.
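A doubly linked list node and two of its operations can be sketched in C as follows. The field order follows the Link1, Data Item, Link2 layout described above. The dlist wrapper with a tail pointer and the names dlist_push_back and dlist_remove are assumptions added for this example.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct dnode {
    struct dnode *prev;   /* Link1: previous node, NULL for the first node */
    int data;             /* Data Item */
    struct dnode *next;   /* Link2: next node, NULL for the last node */
} dnode;

typedef struct {
    dnode *head;          /* first node, NULL if the list is empty */
    dnode *tail;          /* last node, NULL if the list is empty */
} dlist;

/* Insert a new node at the end, maintaining both links. */
int dlist_push_back(dlist *list, int data)
{
    assert(list != NULL);
    dnode *n = malloc(sizeof *n);
    if (n == NULL) return 0;
    n->data = data;
    n->next = NULL;
    n->prev = list->tail;
    if (list->tail != NULL)
        list->tail->next = n;
    else
        list->head = n;       /* list was empty */
    list->tail = n;
    return 1;
}

/* Remove n without traversing: its neighbours are reachable via prev and next. */
void dlist_remove(dlist *list, dnode *n)
{
    assert(list != NULL && n != NULL);
    if (n->prev != NULL) n->prev->next = n->next;
    else                 list->head = n->next;
    if (n->next != NULL) n->next->prev = n->prev;
    else                 list->tail = n->prev;
    free(n);
}
```

Compared with a singly linked list, deletion needs no search for the preceding node, at the cost of one extra pointer per node.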
A circular linked list is similar to a singly linked list. The only difference is that
in a circular linked list, the last node points to the first node in the list.
It is a sequence of elements in which every element has a link to its next
element in the sequence, and the last element has a link to the first element in the
sequence.
In the above figure, we see that each node points to its next node in the
sequence, but the last node points to the first node in the list. Each
element stores the address of the next element, and the last
element stores the address of the starting element. It forms a circular
chain because the elements point to each other in a circular way.
In a circular linked list, memory can be allocated when it is required,
because the list has a dynamic size.
A circular linked list is used in personal computers, where multiple
applications are running. The operating system provides a fixed time slot
for each running application, and the running applications are kept in a
circular linked list until all the applications are completed. This is a real-life
example of a circular linked list.
We can insert elements anywhere in a circular linked list without moving other
elements, whereas inserting into the middle of an array requires shifting the
elements that follow, because an array occupies contiguous memory.
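A circular singly linked list can be sketched in C by keeping a pointer to the last node, so that last->next is always the first node. This representation and the names cnode, clist_append and clist_sum are assumptions for this example.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct cnode {
    int data;
    struct cnode *next;   /* never NULL: the last node points to the first */
} cnode;

/* Append a node. *last is the current last node, or NULL for an empty list. */
int clist_append(cnode **last, int data)
{
    assert(last != NULL);
    cnode *n = malloc(sizeof *n);
    if (n == NULL) return 0;
    n->data = data;
    if (*last == NULL) {
        n->next = n;                /* a single node points to itself */
    } else {
        n->next = (*last)->next;    /* new node points to the first node */
        (*last)->next = n;          /* old last node points to the new node */
    }
    *last = n;
    return 1;
}

/* Visit each node exactly once, starting from the first node. */
int clist_sum(const cnode *last)
{
    if (last == NULL) return 0;
    int sum = 0;
    const cnode *cur = last->next;
    do {
        sum += cur->data;
        cur = cur->next;
    } while (cur != last->next);    /* stop when back at the first node */
    return sum;
}
```

Since there is no null link, a traversal must remember where it started. Repeatedly following next without stopping gives the round-robin order described for time-sliced applications.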
A doubly circular linked list is a linked data structure which consists of a set
of sequentially linked records called nodes.
A doubly circular linked list can be conceptualized as two singly linked lists
formed from the same data items, but in opposite sequential orders.
The above diagram represents the basic structure of a doubly circular
linked list. In a doubly circular linked list, the previous link of the first node
points to the last node and the next link of the last node points to the first
node.
In a doubly circular linked list, each node contains two fields, called links,
that hold references to the previous and the next node in the
sequence of nodes.
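The link structure described above can be sketched in C as follows. The names dcnode and dclist_append are hypothetical, and the list is represented only by a pointer to its first node, which is one of several possible choices.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct dcnode {
    struct dcnode *prev;   /* previous node; for the first node, the last node */
    int data;
    struct dcnode *next;   /* next node; for the last node, the first node */
} dcnode;

/* Append a node at the end. *head is the first node, or NULL if empty. */
int dclist_append(dcnode **head, int data)
{
    assert(head != NULL);
    dcnode *n = malloc(sizeof *n);
    if (n == NULL) return 0;
    n->data = data;
    if (*head == NULL) {
        n->prev = n;
        n->next = n;                   /* a single node links to itself */
        *head = n;
    } else {
        dcnode *tail = (*head)->prev;  /* last node is one step behind the first */
        n->prev = tail;
        n->next = *head;
        tail->next = n;
        (*head)->prev = n;
    }
    return 1;
}
```

Because the first node's previous link leads directly to the last node, appending does not require walking the list.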
Array versus linked list:
1. Array: An array is a collection of elements having the same data type, with a
common name.
Linked list: A linked list is an ordered collection of elements connected by links.
2. Array: Array elements are stored consecutively in memory.
Linked list: Linked list elements can be stored anywhere in memory, as the address of
each node is stored in the previous node.
3. Array: Insert and delete operations take more time in an array.
Linked list: Insert and delete operations do not take more time in a linked list; they
are performed quickly and easily.
4. Array: An array can be single-dimensional, two-dimensional or multidimensional.
Linked list: A linked list can be singly, doubly or circular linked.
5. Array: Each array element is independent and has no connection with the previous
element or with its location.
Linked list: The location or address of an element is stored in the link part of the
previous element or node.
6. Array: The size of an array is fixed once it is declared, so elements cannot be
added or deleted.
Linked list: Nodes can be added to and deleted from the list.
7. Array: In an array, elements can be modified easily by identifying the index value.
Linked list: In a linked list, modifying a node is more complex, because the node must
first be reached by following the links.
8. Array: Array elements are not linked by pointers, so an array does not require extra
space in memory for pointers.
Linked list: Pointers are used in a linked list. Elements are connected using pointers
or links, so extra space is required for pointers.
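The cost of inserting into an array can be illustrated with a short C sketch. The function name array_insert and its parameters are assumptions for this example; the point is that every element after the insert position must be shifted, whereas a linked list only changes links.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Insert value at index pos in arr, which holds *count elements and has
   room for cap elements. Elements from pos onward are shifted one place
   to the right. Returns 1 on success, 0 if the array is full or pos is
   out of range. */
int array_insert(int arr[], size_t *count, size_t cap, size_t pos, int value)
{
    assert(arr != NULL && count != NULL);
    if (*count == cap || pos > *count)
        return 0;
    memmove(&arr[pos + 1], &arr[pos], (*count - pos) * sizeof arr[0]);
    arr[pos] = value;
    (*count)++;
    return 1;
}
```

Inserting at the front of an array with n elements moves all n of them, and the array cannot grow beyond the capacity it was declared with.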