ADBMS Lec#2
Instructor:
Rabia Tariq
Lecturer, Department of Computer Science
Email: [email protected]
File Organization Concepts
• 1. Introduction to File Organization
• 1.1 Why File Organization is Important?
• 2. Types of File Organization in Databases
• 3. Comparison of File Organization Methods
• 4. Choosing the Right File Organization
• 5. Real-World Applications of File Organization
File Organization
• In database systems, data is stored in files, which must be organized
efficiently for quick retrieval and storage.
• File organization refers to the way data records are arranged on
storage media (such as disks). The goal is to ensure fast access,
efficient updates, and minimal storage overhead.
1.1 Why File Organization is Important?
• Faster data retrieval and query execution.
• Optimized storage space utilization.
• Improved performance in multi-user environments.
• Efficient indexing and searching of records.
• Reliable data access for different types of workloads (OLTP, OLAP).
2. Types of File Organization in Databases
• There are several ways to organize files in database systems:
2.1 Heap (Unordered) File Organization
• Heap file organization is the simplest and most basic type of
organization. It works with data blocks: records are inserted at the end
of the file, and no sorting or ordering of records is required on
insertion.
• When a data block is full, the new record is stored in some other
block. This block need not be the very next data block; the DBMS can
select any data block on the storage medium that has free space. The
heap file is also known as an unordered file.
• Every record in the file has a unique id, and every page in the file is
the same size. It is the DBMS's responsibility to store and manage new
records.
Insertion of a new record
Suppose we have five records R1, R3, R6, R4 and R5 in a heap file and we want to insert a new record R2. If
data block 3 is full, R2 is inserted into any data block selected by the DBMS, let's say data block 1.
To search, update or delete data in a heap file, we need to traverse the file from the start
until we reach the requested record.
If the database is very large, searching, updating or deleting a record is time-consuming because there is no
sorting or ordering of records: we may need to check every record before we find the requested
one.
•This method is inefficient for large databases because searching for or modifying a record
requires a scan whose cost grows with the size of the file.
2.1 Heap (Unordered) File Organization
•Records are stored in the order they are inserted, without any specific arrangement.
•Searching for a specific record requires a linear search, which in the worst case scans the entire file.
•Best for: Small databases or when records are rarely searched.
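The insertion and search behaviour described for heap files can be sketched in Python. This is a minimal in-memory illustration, not how any particular DBMS implements it: the class name `HeapFile`, the `block_capacity` parameter, and the policy of using the first block with free space are all assumptions made for the example.

```python
class HeapFile:
    """Toy heap file: a list of fixed-capacity blocks of (record_id, payload) pairs."""

    def __init__(self, block_capacity=2):
        self.block_capacity = block_capacity  # records per block (assumed value)
        self.blocks = [[]]

    def insert(self, record_id, payload):
        # Place the record in the first block with free space; no ordering is kept.
        for block_no, block in enumerate(self.blocks):
            if len(block) < self.block_capacity:
                block.append((record_id, payload))
                return block_no
        # Every block is full, so allocate a new one at the end of the file.
        self.blocks.append([(record_id, payload)])
        return len(self.blocks) - 1

    def search(self, record_id):
        # Linear scan from the start of the file; returns (payload, blocks_read).
        for blocks_read, block in enumerate(self.blocks, start=1):
            for rid, payload in block:
                if rid == record_id:
                    return payload, blocks_read
        return None, len(self.blocks)


heap = HeapFile(block_capacity=2)
for rid in ["R1", "R3", "R6", "R4", "R5"]:
    heap.insert(rid, f"data for {rid}")

block_no = heap.insert("R2", "data for R2")   # lands wherever there is room
payload, blocks_read = heap.search("R2")      # must scan blocks until R2 is found
```

The number of blocks read by `search` grows with the file, which is why heap files suit small tables or tables that are rarely searched.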
2.2 Sequential File Organization
• Records are stored in a sorted order based on a key field (e.g.,
Student ID).
• Searching is faster than in a heap file when records are sorted on the
search key, because binary search can be used.
• Insertions require reordering to maintain the sort order.
• The sequential method is the easiest method of file organization. In
this method, records are stored one after another in a sequential
manner. There are two ways to implement this method:
1. Pile File Method
• This method is quite simple: records are stored in sequence, i.e. one
after the other, in the order in which they are inserted into the
table.
Insertion of the new record: Let R1, R3, and so on up to R5 and R4 be the records in the sequence. Here, a record is
simply a row in a table. Suppose a new record R2 has to be inserted in the sequence; it is simply placed at the
end of the file.
2. Sorted File Method
• As the name suggests, in this method a new record is always inserted
at its sorted position (ascending or descending), so the file stays
ordered. The sorting of records may be based on the primary key or on
any other key.
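The sorted file method can be sketched with Python's standard `bisect` module. This is a hedged in-memory illustration: the class name `SortedFile` and the use of integer keys are assumptions, and a real file on disk would have to shift or overflow records rather than call `list.insert`.

```python
import bisect


class SortedFile:
    """Toy sorted file: keys and records kept in ascending key order."""

    def __init__(self):
        self.keys = []      # sorted key field values
        self.records = []   # records[i] belongs to keys[i]

    def insert(self, key, payload):
        # Find the sorted position, then shift later records to make room.
        pos = bisect.bisect_left(self.keys, key)
        self.keys.insert(pos, key)
        self.records.insert(pos, payload)
        return pos

    def search(self, key):
        # Binary search on the key field.
        pos = bisect.bisect_left(self.keys, key)
        if pos < len(self.keys) and self.keys[pos] == key:
            return self.records[pos]
        return None


sorted_file = SortedFile()
for key in [1, 3, 5, 4]:
    sorted_file.insert(key, f"R{key}")

pos = sorted_file.insert(2, "R2")   # R2 goes between R1 and R3, not at the end
```

In contrast to the pile file method, where R2 would be appended at the end, the insert here pays the cost of keeping the order so that later searches can use binary search.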
2.3 Hash File Organization
• Hash file organization computes a hash function on some fields of
each record. The hash function's output determines the address of the
disk block where the record is to be placed.
• When a record has to be retrieved using the hash key columns, the
address is generated and the whole record is retrieved from that
address. In the same way, when a new record has to be inserted, the
address is generated from the hash key and the record is inserted
directly. The same process applies to delete and update.
• This method avoids searching and sorting the entire file. Records are
scattered across the storage medium at whatever locations the hash
function produces, rather than stored in any key order.
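The address computation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the class name `HashFile`, the choice of `key % num_buckets` as the hash function, integer keys, and collision handling by keeping several records in one bucket are all hypothetical choices, not something the slides prescribe.

```python
class HashFile:
    """Toy hash file: each bucket stands in for one disk block."""

    def __init__(self, num_buckets=4):
        self.buckets = [[] for _ in range(num_buckets)]

    def address(self, key):
        # Assumed hash function: integer key modulo the number of buckets.
        return key % len(self.buckets)

    def insert(self, key, payload):
        bucket = self.buckets[self.address(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, payload)   # same key: update in place
                return
        bucket.append((key, payload))

    def search(self, key):
        # Only the one bucket named by the hash is examined.
        for existing_key, payload in self.buckets[self.address(key)]:
            if existing_key == key:
                return payload
        return None

    def delete(self, key):
        bucket = self.buckets[self.address(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                del bucket[i]
                return True
        return False


hash_file = HashFile(num_buckets=4)
hash_file.insert(101, "Alice")
hash_file.insert(102, "Bob")
hash_file.insert(205, "Carol")   # 205 % 4 == 1, same bucket as key 101

found = hash_file.search(205)
```

Insert, search, update and delete all start from the same computed address, so none of them has to scan or sort the whole file.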
2.4 Indexed File Organization
• An indexed file organization is a method of storing data in a file
where records are not necessarily ordered sequentially but can be
accessed quickly through an index.
How it works:
• Create index: When a new record is added to the data file, its key
value is also added to the index along with its corresponding address
in the data file.
• Search operation: When searching for a record, the system looks up
the desired key in the index, retrieving the pointer to the record's
location in the data file.
• Access record: The system directly accesses the record using the
retrieved pointer.
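The three steps above (create index, search, access record) can be sketched in Python. This is a hedged illustration only: the class name `IndexedFile` is hypothetical, an in-memory `io.BytesIO` buffer stands in for the data file, the index is a plain dictionary rather than a B-tree or similar structure, and payloads are assumed not to contain newline characters.

```python
import io


class IndexedFile:
    """Toy indexed file: records appended to a data file, located via an index."""

    def __init__(self):
        self.data = io.BytesIO()   # stands in for the data file on disk
        self.index = {}            # key -> byte offset of the record

    def insert(self, key, payload):
        # Append the record to the data file and note where it starts.
        self.data.seek(0, io.SEEK_END)
        offset = self.data.tell()
        self.data.write(payload.encode("utf-8") + b"\n")
        # Create index entry: key plus the record's address in the data file.
        self.index[key] = offset
        return offset

    def search(self, key):
        # Search operation: look up the key in the index to get the pointer.
        offset = self.index.get(key)
        if offset is None:
            return None
        # Access record: jump directly to the stored address.
        self.data.seek(offset)
        return self.data.readline().rstrip(b"\n").decode("utf-8")


indexed = IndexedFile()
first = indexed.insert(30, "Carol")    # records arrive in no particular key order
second = indexed.insert(10, "Alice")
third = indexed.insert(20, "Bob")

record = indexed.search(10)            # one index lookup, one direct read
```

The data file itself stays in arrival order; only the index needs to support fast lookup by key.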
Advantages: