Recovery and Indexing
When a transaction starts, a second table, the shadow page table, is created
as a copy of the current page table. The shadow page table is then saved on disk,
and the current page table is used for the transaction.
Implementation of shadow paging
• The main purpose of this technique is to keep the data
consistent if a failure occurs. The technique is also
known as out-of-place updating.
• To commit, output the disk address of the current page table to the
fixed location in stable storage that contains the address of the shadow
page table.
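The steps above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class name `ShadowPagedStore`, its methods, and the in-memory dictionary standing in for the disk are all hypothetical, and a single assignment stands in for overwriting the fixed location in stable storage.

```python
class ShadowPagedStore:
    """Toy shadow-paging store (hypothetical names; 'disk' is just a dict)."""

    def __init__(self, pages):
        self.disk = dict(enumerate(pages))   # physical page id -> contents
        self.next_free = len(self.disk)      # next unused physical page id
        # Shadow page table: logical page number -> physical page id.
        self.shadow_table = {i: i for i in self.disk}
        self.current_table = None

    def begin(self):
        # The current page table starts as a copy of the shadow page table.
        self.current_table = dict(self.shadow_table)

    def write(self, logical_page, value):
        # Out-of-place update: write to a fresh physical page and
        # repoint only the current page table.
        new_id = self.next_free
        self.next_free += 1
        self.disk[new_id] = value
        self.current_table[logical_page] = new_id

    def read(self, logical_page):
        table = self.current_table if self.current_table is not None else self.shadow_table
        return self.disk[table[logical_page]]

    def commit(self):
        # Stand-in for writing the current table's address to the fixed
        # location in stable storage.
        self.shadow_table = self.current_table
        self.current_table = None

    def abort(self):
        # The shadow table was never modified, so the old state is intact.
        self.current_table = None


store = ShadowPagedStore(["a", "b"])
store.begin()
store.write(1, "B")
store.abort()
after_abort = store.read(1)    # old contents are still visible

store.begin()
store.write(1, "B")
store.commit()
after_commit = store.read(1)   # new contents are visible after commit
```

Because the old pages are never overwritten, aborting (or crashing before the commit step) leaves the shadow page table pointing at a consistent state.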
ARIES algorithm
The ARIES algorithm is used for recovery, but it is a little more complicated
than the one we studied earlier.
DirtyPageTable – lists the pages that have been updated in the
database buffer. It contains an entry for each dirty page in the buffer,
which includes the page ID and the LSN corresponding to the earliest
update to that page.
The Transaction Table contains an entry for each active transaction, with
information such as the transaction ID, transaction status, and the LSN of
the most recent log record for the transaction.
Checkpoint log – A checkpoint log record contains the DirtyPageTable and a
list of active transactions. For each transaction, the checkpoint log record
also notes lastLSN, the LSN of the last record written by the transaction.
• REDO PHASE
• The REDO phase reapplies updates from the log to the database. Certain information in the ARIES
log provides the start point for REDO, from which REDO operations are applied until the end of the log is
reached. In addition, information stored by ARIES and in the data pages allows ARIES to determine whether the
operation to be redone has already been applied to the database and hence need not be reapplied. Thus
only the necessary REDO operations are applied during recovery.
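A minimal sketch of the REDO decision, under stated assumptions, may help. The names `LogRecord`, `Page`, and `redo` are hypothetical. The sketch assumes that the start point is the smallest LSN recorded in the Dirty Page Table and that each data page stores the LSN of the last update it reflects (`page_lsn`).

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    lsn: int
    page_id: str
    new_value: str


@dataclass
class Page:
    value: str
    page_lsn: int   # LSN of the last update already reflected on this page


def redo(log, dirty_page_table, pages):
    """Reapply only the updates that are not yet on the pages; return their LSNs."""
    if not dirty_page_table:
        return []
    start = min(dirty_page_table.values())   # assumed REDO start point
    redone = []
    for rec in log:
        if rec.lsn < start:
            continue                          # before the start point
        rec_lsn = dirty_page_table.get(rec.page_id)
        if rec_lsn is None or rec.lsn < rec_lsn:
            continue                          # page was not dirty for this update
        page = pages[rec.page_id]
        if page.page_lsn >= rec.lsn:
            continue                          # update is already on the page
        page.value = rec.new_value
        page.page_lsn = rec.lsn
        redone.append(rec.lsn)
    return redone


log = [
    LogRecord(1, "P1", "x1"),
    LogRecord(2, "P2", "y1"),
    LogRecord(3, "P1", "x2"),
    LogRecord(4, "P2", "y2"),
]
dirty_page_table = {"P1": 3, "P2": 2}   # page ID -> LSN of earliest update
pages = {"P1": Page("x1", 1), "P2": Page("y1", 2)}
redone = redo(log, dirty_page_table, pages)
```

In this example the update with LSN 2 is skipped because page P2 already reflects it, so only LSNs 3 and 4 are reapplied.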
• UNDO PHASE
• During the UNDO phase, the log is scanned backwards and the operations of transactions that were active at
the time of the crash are undone in reverse order. The information needed for ARIES to accomplish its
recovery procedure includes the log, the Transaction Table, and the Dirty Page Table. In addition,
checkpointing is used. The two tables are maintained by the transaction manager and written to the log during
checkpointing.
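The backward scan can be sketched as follows. This is a simplified illustration, not the full ARIES procedure: it omits compensation log records, and the names `UpdateRecord`, `undo`, and the `prev_lsn` field (a pointer to the transaction's previous log record) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UpdateRecord:
    lsn: int
    txn_id: str
    page_id: str
    old_value: str
    prev_lsn: Optional[int]   # previous log record of the same transaction


def undo(log, transaction_table, pages):
    """Undo the updates of active transactions, highest LSN first."""
    by_lsn = {rec.lsn: rec for rec in log}
    # Start from the most recent log record of every active transaction.
    to_undo = {lsn for lsn in transaction_table.values() if lsn is not None}
    undone = []
    while to_undo:
        lsn = max(to_undo)                    # always the latest remaining record
        to_undo.remove(lsn)
        rec = by_lsn[lsn]
        pages[rec.page_id] = rec.old_value    # restore the before-image
        undone.append(lsn)
        if rec.prev_lsn is not None:
            to_undo.add(rec.prev_lsn)
    return undone


log = [
    UpdateRecord(1, "T1", "P1", "a0", None),
    UpdateRecord(2, "T2", "P2", "b0", None),
    UpdateRecord(3, "T1", "P3", "c0", 1),
    UpdateRecord(4, "T2", "P2", "b1", 2),
]
# Assume T1 committed before the crash, so only T2 is in the Transaction Table.
transaction_table = {"T2": 4}           # transaction ID -> most recent LSN
pages = {"P1": "a1", "P2": "b2", "P3": "c1"}
undone = undo(log, transaction_table, pages)
```

Only T2's updates are rolled back, in reverse order (LSN 4, then LSN 2); the pages written by the committed transaction T1 are left alone.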
File organization
Storing files in a certain order is called file organization. The main
objectives of file organization are:
i. Records should be accessed as fast as possible.
ii. Any insert, update or delete operation on records should be easy and
quick, and should not harm other records.
iii. No duplicate records should be introduced as a result of an insert,
update or delete.
iv. Records should be stored efficiently so that the cost of storage is
minimal.
Types of file organizations:
• 1. Sequential file organization
• This technique stores the data elements in sequence, organized one after
another.
• The file stores unique data attributes for identification, which help to
place the data elements in sequence. The sequencing is logical: it is the
order in which the database management system stores the data elements.
Sequential file organizations
• 1. Pile file method
• It is a standard method for sequential file organization in which the data
elements are stored one after another in the order in which they are inserted.
• When a new record is inserted, it is placed at the end of the file, that is,
after the last inserted data element or record.
• 2. Sorted file method
• In the sorted file method, the new data element or record is also
inserted at the end of the file. After the insertion, the file is sorted
in ascending or descending order based on the key.
• For an update, the data element is searched for and updated based on the
condition. After the update completes, the file is sorted again so that the
updated data element is placed at the right position in the sequential file
structure.
• For a deletion, the data item is searched for in the sorted sequence and
marked as deleted once it is found. After the delete operation completes,
the remaining data elements are sorted and rearranged again in the original
ascending or descending order.
Illustration of insertion in sorted file method using an example:
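As a hedged illustration of insertion in the sorted file method, the sketch below appends the new record and then re-sorts the file on its key. The function name `insert_sorted` and the sample records are made up for the example; records are modelled as (key, data) pairs.

```python
def insert_sorted(file_records, record, descending=False):
    """Append the record at the end of the file, then re-sort on the key."""
    file_records.append(record)                                 # step 1: insert at the end
    file_records.sort(key=lambda r: r[0], reverse=descending)   # step 2: sort by key
    return file_records


records = [(1, "R1"), (3, "R3"), (6, "R6"), (7, "R7")]
insert_sorted(records, (2, "R2"))
keys_after_insert = [key for key, _ in records]
```

Re-sorting the whole file after every insert is simple but costly for large files, which is the main drawback of this method.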
Heap File Organization
When a file is created using Heap File Organization, the operating system allocates
a memory area to that file without any further accounting details. File records can be
placed anywhere in that memory area. Heap File Organization works with data
blocks. In this method, records are inserted at the end of the file, into the data blocks.
No sorting or ordering is required. If a data block is full, the new
record is stored in some other block. This other data block need not be the very
next data block; it can be any block in the memory. It is the responsibility of
the DBMS to store and manage the new records.
• If we want to search, delete or update data in heap file organization, we must
traverse the data from the beginning of the file until we find the requested record.
Thus, if the database is very large, searching, deleting or updating a record will
take a lot of time.
• Suppose we have five records in the heap, R1, R5, R6, R4 and R3, and a
new record R2 has to be inserted. Since the last data block, i.e. data
block 3, is full, R2 will be inserted into any data block selected by the DBMS,
let's say data block 1.
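The example can be sketched in Python as follows. The block capacity, the block layout, and the policy of using the first block with free space are all assumptions made for illustration; a real DBMS may choose any block. The names `heap_insert` and `heap_search` are hypothetical.

```python
BLOCK_CAPACITY = 2   # assumed number of records per data block


def heap_insert(blocks, record, capacity=BLOCK_CAPACITY):
    """Place the record in the first block with free space; return its index."""
    for i, block in enumerate(blocks):
        if len(block) < capacity:
            block.append(record)
            return i
    blocks.append([record])              # every block is full: allocate a new one
    return len(blocks) - 1


def heap_search(blocks, record):
    """Scan from the beginning; return (block index, records examined)."""
    examined = 0
    for i, block in enumerate(blocks):
        for r in block:
            examined += 1
            if r == record:
                return i, examined
    return None, examined


# Assumed layout: data block 1 holds R1, block 2 holds R5 and R6,
# block 3 holds R4 and R3 (so blocks 2 and 3 are full).
blocks = [["R1"], ["R5", "R6"], ["R4", "R3"]]
placed_in = heap_insert(blocks, "R2")    # index 0 corresponds to data block 1
found = heap_search(blocks, "R3")
```

Searching for R3 has to examine every record in the file, which shows why lookups in a heap file become slow as the file grows.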
Hash file organization
• Hash File Organization computes a hash function on some fields of
the records. The output of the hash function determines the location of
the disk block where the records are to be placed.
• In this method of file organization, the hash function is used to calculate
the address of the block in which to store the records.
• The hash function can be any simple or complex mathematical function.
• Hence records are stored at effectively random locations, irrespective of
the order in which they arrive. This method is therefore also known as
Direct or Random file organization.
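A minimal sketch of this idea, assuming an integer key and a simple modulo hash function (both are illustrative choices, as are the names `block_address`, `hash_insert`, and `hash_lookup`):

```python
NUM_BLOCKS = 4   # assumed number of disk blocks


def block_address(key, num_blocks=NUM_BLOCKS):
    """A simple hash function: the key modulo the number of blocks."""
    return key % num_blocks


def hash_insert(blocks, key, record):
    blocks[block_address(key, len(blocks))].append((key, record))


def hash_lookup(blocks, key):
    """Go straight to the block the hash function points to; no file scan."""
    for k, record in blocks[block_address(key, len(blocks))]:
        if k == key:
            return record
    return None


blocks = [[] for _ in range(NUM_BLOCKS)]
for key, name in [(104, "Ann"), (101, "Bob"), (110, "Cy")]:
    hash_insert(blocks, key, name)

bob = hash_lookup(blocks, 101)
missing = hash_lookup(blocks, 999)
```

A lookup only touches the one block that the hash function selects, regardless of the order in which the records were inserted.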
Clustered file organization
• In this mechanism, related records from one or more relations are kept
in the same disk block, that is, the ordering of records is not based on
primary key or search key.
• In this method, two or more tables that are frequently joined to
get results are stored in the same file, called a cluster. These files
have two or more tables in the same data block, and the key columns
that map these tables are stored only once. This method hence
reduces the cost of searching for related records in different files. All the
records are found in one place, which makes the search efficient.
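The idea can be sketched with two small made-up tables, a department table and an employee table, clustered on a department ID. The table contents and the names `build_clusters` and `join_for_key` are purely illustrative assumptions.

```python
# Hypothetical tables: (dept_id, dept_name) and (emp_id, emp_name, dept_id).
departments = [(10, "Sales"), (20, "Research")]
employees = [(1, "Ann", 10), (2, "Bob", 20), (3, "Cy", 10)]


def build_clusters(departments, employees):
    """Group rows of both tables under the shared key, stored only once."""
    clusters = {}
    for dept_id, dept_name in departments:
        clusters[dept_id] = {"dept_name": dept_name, "employees": []}
    for emp_id, emp_name, dept_id in employees:
        # The cluster key (dept_id) is not repeated inside each employee row.
        clusters[dept_id]["employees"].append((emp_id, emp_name))
    return clusters


def join_for_key(clusters, dept_id):
    """Answer the join for one key by reading a single cluster."""
    cluster = clusters[dept_id]
    return [(name, cluster["dept_name"]) for _, name in cluster["employees"]]


clusters = build_clusters(departments, employees)
sales = join_for_key(clusters, 10)
```

Because the matching rows of both tables sit together, the join for one key reads a single cluster instead of searching two separate files.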
RAID (Redundant Array of Independent Disks)
• RAID is a technology that connects multiple secondary storage devices
and uses them as a single storage medium.
Multilevel index
• A multilevel index breaks the index down into several smaller indices in
order to make the outermost level so small that it can be saved in a single disk
block, which can easily be accommodated anywhere in the main memory.
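A small sketch of a multilevel index, assuming sorted keys and a fixed number of entries per block (`FANOUT`, an arbitrary value chosen for the example). Each index level holds the first key of every block of the level below it, and levels are added until the outermost level fits in one block. The names `build_levels` and `lookup` are hypothetical.

```python
from bisect import bisect_right

FANOUT = 3   # assumed number of entries per block


def build_levels(keys, fanout=FANOUT):
    """Return all levels, outermost first; the last level holds the data blocks."""
    level = [keys[i:i + fanout] for i in range(0, len(keys), fanout)]
    levels = [level]
    while len(level) > 1:
        firsts = [block[0] for block in level]   # one entry per block below
        level = [firsts[i:i + fanout] for i in range(0, len(firsts), fanout)]
        levels.append(level)
    return levels[::-1]


def lookup(levels, key, fanout=FANOUT):
    """Follow one block per level; return (data block number, found?)."""
    block_no = 0
    for level in levels[:-1]:
        block = level[block_no]
        pos = max(bisect_right(block, key) - 1, 0)   # last entry <= key
        block_no = block_no * fanout + pos
    return block_no, key in levels[-1][block_no]


keys = list(range(1, 28))        # 27 sorted keys -> 9 data blocks
levels = build_levels(keys)
hit = lookup(levels, 14)
miss = lookup(levels, 100)
```

With 27 keys and three entries per block, the structure has three levels and the outermost level is a single block, so a lookup reads exactly one block per level.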
Indexing data structures
• ISAM (Indexed Sequential Access Method) is a file management system developed
at IBM that allows records to be accessed either sequentially (in the order they
were entered) or randomly (with an index).
• ISAM is a static index structure. It is effective when the file is not
frequently updated and is not suitable for files that grow and shrink.
• In ISAM, leaf pages contain data entries and non-leaf pages contain index
entries of the form (search key value, page id).
One level ISAM Tree
Multi level ISAM Tree
• If we want to insert records into the database, overflow
pages might be created to hold the new data. The ISAM
tree remains static, so insertions and deletions in the data
file do not affect the tree layers.
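The behaviour described above can be sketched with a one-level ISAM structure. This is a simplified model under stated assumptions: the class name `IsamFile`, the page size, and the use of one overflow list per leaf page are all illustrative choices.

```python
from bisect import bisect_right


class IsamFile:
    """Toy one-level ISAM file: a static index over leaf pages plus overflow pages."""

    def __init__(self, sorted_keys, page_size=2):
        self.page_size = page_size
        self.leaves = [sorted_keys[i:i + page_size]
                       for i in range(0, len(sorted_keys), page_size)]
        # Static index: first key of each leaf page, fixed when the file is built.
        self.index = [leaf[0] for leaf in self.leaves]
        self.overflow = [[] for _ in self.leaves]

    def _leaf_for(self, key):
        return max(bisect_right(self.index, key) - 1, 0)

    def insert(self, key):
        i = self._leaf_for(key)
        if len(self.leaves[i]) < self.page_size:
            self.leaves[i].append(key)
            self.leaves[i].sort()
        else:
            self.overflow[i].append(key)   # leaf is full: use an overflow page

    def search(self, key):
        i = self._leaf_for(key)
        return key in self.leaves[i] or key in self.overflow[i]


isam = IsamFile([10, 20, 30, 40])
index_before = list(isam.index)
isam.insert(25)                 # leaf [10, 20] is full, so 25 goes to overflow
index_after = list(isam.index)
found_25 = isam.search(25)
found_26 = isam.search(26)
```

The index is identical before and after the insert. The cost is that long overflow chains slow down searches, which is why ISAM suits files that are rarely updated.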