Failure Classification in DBMS
Disk failures are usually due to hardware errors with the hard disk drive itself.
These may include read/write head crashes, platter damage due to improper
handling, and power surges that can cause permanent corruption of data on the
disk.
System crashes are often caused by software bugs, but are also sometimes
caused by hardware errors that corrupt the operating system. Software bugs may
be due to a programming error in the database management system, a poorly
written SQL statement, or a misconfiguration of the DBMS. In either case, when
data on disk is corrupted, the system cannot recover from this state and is
forced to reboot.
Generally, a transaction failure and a disk failure are independent events and
may occur simultaneously; in that case only the transaction failure is reported,
and its effect may be limited to a small number of transactions. A system crash,
on the other hand, can occur at any time and causes all transactions on the
system to fail. Other types of system error, such as an I/O error, can be grouped
under system crash because they also cause every transaction on the system to
fail.
When a system crash occurs, the system usually indicates the failure by displaying
an error message and then enters a reboot state. A disk failure, however, does
not always cause a system crash. In that case, the transaction returns an error
message or fault to the application program that attempted to access the data on
that disk. Depending on how the DBMS handles this type of fault, it may return
errors from all disks in the system or from certain disks only.
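The difference in scope between the three failure types can be sketched in a few
lines of Python. This is only an illustrative model: the Failure enum, the
affected_transactions function, and the mapping from transaction names to the
disks they touch are all hypothetical and do not correspond to any real DBMS
interface.

```python
from enum import Enum, auto


class Failure(Enum):
    TRANSACTION = auto()   # a single transaction fails
    SYSTEM_CRASH = auto()  # the whole system goes down
    DISK = auto()          # one disk becomes unreadable


def affected_transactions(failure, active, failed_txn=None, failed_disk=None):
    """Return the set of active transactions that fail.

    `active` maps a transaction name to the set of disks it accesses
    (a hypothetical model used only for this illustration).
    """
    if failure is Failure.SYSTEM_CRASH:
        # A crash takes down every running transaction.
        return set(active)
    if failure is Failure.TRANSACTION:
        # Only the failing transaction itself is affected.
        return {failed_txn} if failed_txn in active else set()
    # Disk failure: only transactions that access the failed disk see an error.
    return {txn for txn, disks in active.items() if failed_disk in disks}


active = {"T1": {"disk0"}, "T2": {"disk1"}, "T3": {"disk0", "disk1"}}
crash = affected_transactions(Failure.SYSTEM_CRASH, active)
disk = affected_transactions(Failure.DISK, active, failed_disk="disk1")
single = affected_transactions(Failure.TRANSACTION, active, failed_txn="T1")
```

In this model a system crash affects T1, T2 and T3, a failure of disk1 affects
only T2 and T3, and a transaction failure of T1 affects T1 alone.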
Recovery
When a system with concurrent transactions crashes and recovers, it behaves in
the following manner:
The recovery system reads the logs backwards from the end to the last
checkpoint. It maintains two lists, an undo-list and a redo-list.
If the recovery system sees a log with <Tn, Start> and <Tn, Commit>, or just
<Tn, Commit>, it puts the transaction in the redo-list.
If the recovery system sees a log with <Tn, Start> but no commit or abort
record, it puts the transaction in the undo-list.
All the transactions in the undo-list are then undone and their logs are removed.
For all the transactions in the redo-list, the previous logs are removed and the
transactions are redone before their logs are saved.
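The classification step described above can be sketched in Python. This is a
minimal illustration under stated assumptions, not a real recovery manager: the
log is modelled as a list of (transaction, action) tuples, a checkpoint is the
hypothetical record (None, "checkpoint"), and the classify function is invented
for this example. It only builds the two lists; undoing and redoing the actual
data changes is left out.

```python
def classify(log):
    """Scan the log backwards to the last checkpoint and build both lists.

    `log` is a list of (transaction, action) tuples, oldest first. The
    action is one of "start", "commit", "abort" or "checkpoint"
    (an assumed record format, chosen only for this sketch).
    """
    redo_list, undo_list = [], []
    finished = set()  # transactions whose commit or abort was already seen

    for txn, action in reversed(log):
        if action == "checkpoint":
            break  # stop at the most recent checkpoint
        if action == "commit":
            # Committed after the checkpoint: must be redone.
            redo_list.append(txn)
            finished.add(txn)
        elif action == "abort":
            finished.add(txn)
        elif action == "start" and txn not in finished:
            # Started but never committed or aborted: must be undone.
            undo_list.append(txn)

    return redo_list, undo_list


log = [
    ("T1", "start"),
    (None, "checkpoint"),
    ("T2", "start"),
    ("T1", "commit"),   # only the commit of T1 lies after the checkpoint
    ("T3", "start"),    # T3 never finishes
    ("T2", "commit"),
    ("T4", "start"),
    ("T4", "abort"),
]
redo_list, undo_list = classify(log)
```

With this sample log, T2 and T1 land in the redo-list and T3 lands in the
undo-list, while the aborted T4 appears in neither. Because the scan stops at the
checkpoint, a transaction that started before the checkpoint and never finished
would not be found by this sketch; handling that case needs extra information,
such as a list of active transactions stored with the checkpoint.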