Organization of Files - Sequential & Multitable
Module 4
ORGANIZATION OF RECORDS IN FILES
Given a set of records of a relation, the different ways to organize the records in a
file are:
1. Heap file organization: Any record can be placed anywhere in the file where there is
space for the record. There is no ordering of records.
2. Sequential file organization: Records are stored in sequential order, according to the
value of a “search key” of each record.
A search key is any attribute or set of attributes; it need not be the primary key, or
even a super key.
For fast retrieval of records in search-key order, records are chained together using
pointers: each record holds a pointer to the next record in search-key order.
To minimize the number of block accesses in sequential file processing, records are
stored physically in search-key order, or as close to search-key order as possible.
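The effect of physical placement on block accesses can be sketched as follows. This is a minimal illustration, not a real storage engine: the `Record` class, the slot numbering, the single-block buffer, and the keys A to F are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    key: str                    # search-key value (hypothetical data)
    next_slot: Optional[int]    # slot of the next record in search-key order


def scan_in_key_order(slots: List[Record], head: Optional[int]) -> List[str]:
    """Follow the pointer chain from head and return keys in search-key order."""
    keys, pos = [], head
    while pos is not None:
        keys.append(slots[pos].key)
        pos = slots[pos].next_slot
    return keys


def block_reads(slots: List[Record], head: Optional[int], records_per_block: int) -> int:
    """Count block reads for a chain scan, assuming only one block is buffered."""
    reads, current_block, pos = 0, None, head
    while pos is not None:
        block = pos // records_per_block
        if block != current_block:      # the chain moved to another block
            reads += 1
            current_block = block
        pos = slots[pos].next_slot
    return reads


# Physical order equals search-key order: A B | C D | E F
sorted_file = [Record("A", 1), Record("B", 2), Record("C", 3),
               Record("D", 4), Record("E", 5), Record("F", None)]

# Same chain, but records scattered on disk: A D | B E | C F
scattered = [Record("A", 2), Record("D", 3), Record("B", 4),
             Record("E", 5), Record("C", 1), Record("F", None)]
```

Both files return the same keys in the same order, but with two records per block the sorted layout needs 3 block reads and the scattered one needs 6, one per record.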
SEQUENTIAL FILE ORGANIZATION
3. In either case, adjust the pointers so as to chain together the records in search-key order.
The structure allows fast insertion of new records.
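The pointer adjustment on insertion can be sketched as below. The two cases are an assumption here, since the earlier steps are not shown: the sketch places a new record in a free slot of its predecessor's block when one exists, and otherwise appends it to an overflow area at the end of the file. The class name `SequentialFile`, the integer keys, and the list-of-slots layout are all hypothetical.

```python
class SequentialFile:
    """Toy sequential file: each slot is [key, next_slot]; None marks a free slot."""

    def __init__(self, keys, records_per_block):
        self.rpb = records_per_block
        self.slots = [[k, i + 1] for i, k in enumerate(sorted(keys))]
        if self.slots:
            self.slots[-1][1] = None            # last record ends the chain
        self.head = 0 if self.slots else None
        while len(self.slots) % self.rpb:       # pad so overflow starts on a block boundary
            self.slots.append(None)

    def _predecessor(self, key):
        """Slot of the last record with key <= the given key, or None."""
        prev, pos = None, self.head
        while pos is not None and self.slots[pos][0] <= key:
            prev, pos = pos, self.slots[pos][1]
        return prev

    def delete(self, key):
        """Unlink the record from the chain and mark its slot free."""
        prev, pos = None, self.head
        while pos is not None and self.slots[pos][0] != key:
            prev, pos = pos, self.slots[pos][1]
        if pos is None:
            return False
        if prev is None:
            self.head = self.slots[pos][1]
        else:
            self.slots[prev][1] = self.slots[pos][1]
        self.slots[pos] = None
        return True

    def insert(self, key):
        """Insert a record and return the slot it was placed in."""
        pred = self._predecessor(key)
        anchor = pred if pred is not None else self.head
        target = None
        if anchor is not None:                  # look for a free slot in the anchor's block
            start = (anchor // self.rpb) * self.rpb
            for i in range(start, min(start + self.rpb, len(self.slots))):
                if self.slots[i] is None:
                    target = i
                    break
        if target is None:                      # no free slot: use the overflow area
            self.slots.append(None)
            target = len(self.slots) - 1
        # In either case, adjust the pointers to keep the chain in search-key order.
        if pred is None:
            self.slots[target] = [key, self.head]
            self.head = target
        else:
            self.slots[target] = [key, self.slots[pred][1]]
            self.slots[pred][1] = target
        return target

    def keys_in_order(self):
        out, pos = [], self.head
        while pos is not None:
            out.append(self.slots[pos][0])
            pos = self.slots[pos][1]
        return out


f = SequentialFile([10, 20, 30, 40], records_per_block=2)
overflow_slot = f.insert(45)    # block of 40 is full, so 45 goes to overflow
f.delete(20)                    # frees slot 1 in the first block
reused_slot = f.insert(15)      # 15 follows 10 and reuses the free slot
```

Only the predecessor's pointer and the new record's pointer change on each insertion, which is why insertion is fast. The cost is that records in the overflow area are no longer stored in physical search-key order.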
Many relational database systems store the (fixed-length) records of each
relation in a separate file. This simple implementation is well suited to low-cost,
small databases; it also requires less code to implement the
system.
Consider a query that computes the join of department and instructor on dept_name.
For each tuple of department, the system must locate the instructor tuples with the
same value for dept_name. These records can be located with the help of indices.
Regardless of how these records are located, they need to be transferred from disk into
main memory.
In the worst case, each record will reside on a different block, forcing us to do one block
read for each record required by the query.
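The worst case can be made concrete with a small count. This is a sketch under assumed data: the instructor file is modelled as a list of blocks, each holding hypothetical `(ID, dept_name)` tuples, and the function name `join_block_reads` is invented for the example.

```python
def join_block_reads(instructor_blocks, dept_names):
    """For each department, count the instructor blocks that hold a matching tuple."""
    reads = {}
    for dept in dept_names:
        reads[dept] = sum(
            1 for block in instructor_blocks
            if any(d == dept for _id, d in block)
        )
    return reads


# Instructor relation stored in its own file, two records per block (made-up data).
instructor_blocks = [
    [(1, "CS"), (2, "Physics")],
    [(3, "Music"), (4, "CS")],
    [(5, "Physics"), (6, "CS")],
]

reads = join_block_reads(instructor_blocks, ["CS", "Physics", "Music"])
```

Here the three CS instructor records sit on three different blocks, so the CS department alone costs three block reads: one read per record required.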
MULTITABLE CLUSTERING FILE ORGANIZATION
In a multitable clustering file organization of department and instructor, the instructor tuples for each ID are stored near the department tuple for the corresponding dept_name.
When a tuple of the department relation is read, the entire block containing that tuple is copied from disk
into main memory. Since the corresponding instructor tuples are stored on the disk near the department
tuple, the block containing the department tuple contains tuples of the instructor relation needed to
process the query.
If a department has so many instructors that the instructor records do not fit in one block, the remaining
records appear on nearby blocks.
As the multitable clustering file organization stores related records of two or more relations in each
block, it allows us to read records that would satisfy the join condition by using one block read.
Thus, we are able to process this particular query more efficiently.
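A clustered layout of this kind can be sketched as follows. The tuple shapes `(dept_name, building)` and `(ID, dept_name)`, the sample values, and the helper names `cluster` and `blocks_for_join` are assumptions made for illustration only.

```python
def cluster(departments, instructors, records_per_block):
    """Place each department tuple before its instructor tuples, then cut into blocks."""
    stream = []
    for dept in departments:
        stream.append(("department", dept))
        stream.extend(("instructor", ins) for ins in instructors if ins[1] == dept[0])
    return [stream[i:i + records_per_block]
            for i in range(0, len(stream), records_per_block)]


def blocks_for_join(blocks, dept_name):
    """Block numbers holding the department tuple or one of its instructor tuples."""
    hits = set()
    for number, block in enumerate(blocks):
        for kind, rec in block:
            name = rec[0] if kind == "department" else rec[1]
            if name == dept_name:
                hits.add(number)
    return sorted(hits)


departments = [("CS", "B1"), ("Physics", "B2")]            # (dept_name, building)
instructors = [(1, "CS"), (2, "Physics"), (3, "CS"),
               (4, "CS"), (5, "Physics")]                  # (ID, dept_name)

roomy = cluster(departments, instructors, records_per_block=4)
tight = cluster(departments, instructors, records_per_block=3)
```

With four records per block, the CS department tuple and all of its instructors land in block 0, so the join for CS needs a single block read. With three records per block, the CS records no longer fit and spill into the adjacent block 1.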
Clustering multiple tables into a single file speeds up processing of a particular join (that
of department and instructor), but it slows processing of other types of queries.
For example, the query select * from department; requires more block accesses than
it did in the scheme under which we stored each relation in a separate file, since each
block now contains significantly fewer department records.
To locate all tuples of the department relation efficiently in the multitable structure,
we can chain together all the records of that relation using pointers.
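The cost of scanning department in a clustered file, and the pointer chain that helps, can be sketched like this. The block contents, the `(block, slot)` address format, and the function names are hypothetical; a real system would store the pointer inside each record rather than in a separate dictionary.

```python
import math


def build_chain(blocks):
    """Link all department records; returns (head, next_ptr) over (block, slot) addresses."""
    addrs = [(b, s)
             for b, block in enumerate(blocks)
             for s, (kind, _value) in enumerate(block)
             if kind == "department"]
    next_ptr = dict(zip(addrs, addrs[1:] + [None]))
    head = addrs[0] if addrs else None
    return head, next_ptr


def scan_departments(blocks, head, next_ptr):
    """Follow the chain; return department names and the number of distinct blocks touched."""
    names, touched, addr = [], set(), head
    while addr is not None:
        block_no, slot = addr
        touched.add(block_no)
        names.append(blocks[block_no][slot][1])
        addr = next_ptr[addr]
    return names, len(touched)


# Clustered file, three records per block (made-up data).
blocks = [
    [("department", "CS"), ("instructor", 1), ("instructor", 3)],
    [("instructor", 4), ("instructor", 6), ("instructor", 7)],
    [("department", "Physics"), ("instructor", 2), ("instructor", 5)],
]

head, next_ptr = build_chain(blocks)
names, clustered_reads = scan_departments(blocks, head, next_ptr)

# With department in its own file, the same two records fit in one block.
separate_reads = math.ceil(len(names) / 3)
```

Following the chain skips block 1, which holds no department record, but the scan still touches two blocks where a separate department file would need one.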
Careful use of multitable clustering can produce significant performance gains in query
processing.