Advanced Database Chapter 456
Chapter Four
CONCURRENCY CONTROL
1/9/2024
For example:
• Consider the diagram below, where two transactions, TX and TY, operate on the
same account A, whose balance is $300.
• At time t1, transaction TX reads the value of account A, i.e., $300 (read only).
• At time t2, TX deducts $50 from account A, making its local value $250 (only
computed, not yet written).
• Meanwhile, at time t3, transaction TY reads the value of account A, which is still
$300 because TX has not written its update yet.
• At time t4, TY adds $100 to account A, making its local value $400 (only
computed, not yet written).
• At time t6, TX writes its value of account A, so the stored balance becomes $250,
since TY has not written its value yet.
• Similarly, at time t7, TY writes its value of account A, i.e., the $400 computed at
time t4. The value written by TX is therefore lost, i.e., the $250 update is
overwritten.
Hence the data becomes incorrect, and the database is left in an inconsistent state.
This is the Lost Update problem.
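The interleaving above can be reproduced as a minimal Java sketch; the sequential statements below stand in for the two transactions' steps (the class and variable names are illustrative, not a DBMS API):

```java
// A minimal sketch of the lost-update interleaving above.
// TX and TY each take a snapshot of the balance, modify their
// local copy, and write it back; TY's write overwrites TX's.
public class LostUpdateDemo {
    static int accountA = 300;           // shared balance of account A

    public static void main(String[] args) {
        int txLocal = accountA;          // t1: TX reads $300
        txLocal -= 50;                   // t2: TX computes $250 locally
        int tyLocal = accountA;          // t3: TY still reads $300
        tyLocal += 100;                  // t4: TY computes $400 locally
        accountA = txLocal;              // t6: TX writes $250
        accountA = tyLocal;              // t7: TY writes $400; TX's update is lost
        System.out.println("Final balance: " + accountA); // 400, not the correct 350
    }
}
```

The final balance of $400 silently drops TX's $50 deduction, which is exactly the inconsistency concurrency control must prevent.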
• At time t1, transaction TX reads the value of account A, i.e., $300.
• At time t2, transaction TY reads the value of account A, i.e., $300.
• At time t3, TY updates the value of account A by adding $100 to the available
balance, making it $400.
• At time t4, TY writes the updated value, i.e., $400.
• After that, at time t5, TX reads the available value of account A, which is now
read as $400.
• Thus, within the same transaction, TX reads two different values of account A:
$300 initially, and $400 after the update made by TY. This is an unrepeatable
read and is therefore known as the Unrepeatable Read problem.
• Thus, in order to maintain consistency in the database and avoid
such problems in concurrent execution, this execution must be
managed, and that is where the concept of Concurrency Control
comes into play.
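The unrepeatable-read interleaving can likewise be sketched in Java; again the names are illustrative and TY's update is modeled as a single in-place statement:

```java
// A minimal sketch of the unrepeatable-read interleaving above:
// TX reads A twice and sees two different values because TY
// writes an update in between TX's two reads.
public class UnrepeatableReadDemo {
    static int accountA = 300;           // shared balance of account A

    public static void main(String[] args) {
        int firstRead = accountA;        // t1: TX reads $300
        accountA = accountA + 100;       // t3-t4: TY adds $100 and writes $400
        int secondRead = accountA;       // t5: TX reads again and sees $400
        System.out.println(firstRead + " vs " + secondRead); // 300 vs 400
    }
}
```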
Concurrency Control
• Concurrency Control is the mechanism required for controlling and
managing the concurrent execution of database operations, thus
avoiding inconsistencies in the database.
• To maintain the concurrency of the database, concurrency control
protocols are used.
Lock-Based Protocol
• In this type of protocol, a transaction cannot read or write a data item until it
acquires an appropriate lock on it.
• A lock is a variable associated with a data item that denotes which operations can
be executed on that item.
• There are two types of lock:
1. Shared lock (read-only lock, or lock-S)
• Under a shared lock, the data item can only be read by the transaction.
• It can be shared between transactions, because while a transaction holds a shared lock it cannot
update the data item.
• So, any number of transactions can hold lock-S (a shared lock) on the same item at the same time.
2. Exclusive lock (lock-X):
• Under an exclusive lock, the data item can be both read and written by the transaction.
• This lock is exclusive: under it, multiple transactions cannot modify the same data item
simultaneously.
• Lock-X (an exclusive lock) is held by only one transaction at a time.
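The shared/exclusive distinction can be illustrated with Java's `java.util.concurrent.locks.ReentrantReadWriteLock`, which similarly pairs a shared read lock with an exclusive write lock; this is only a rough analogy to a DBMS lock manager, not one:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockDemo {
    public static void main(String[] args) {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        // Lock-S analogy: the read lock can be held many times at once
        lock.readLock().lock();
        lock.readLock().lock();          // re-acquired to mimic a second reader
        System.out.println("Holds of lock-S: " + lock.getReadLockCount()); // 2
        lock.readLock().unlock();
        lock.readLock().unlock();

        // Lock-X analogy: the write lock is exclusive to one holder
        lock.writeLock().lock();
        System.out.println("Lock-X held: " + lock.isWriteLocked()); // true
        lock.writeLock().unlock();
    }
}
```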
Lesson 2
Example:
In this example, if lock conversion is allowed, then the following can happen:
1. Upgrading a lock (from S(A) to X(A)) is allowed only in the growing phase.
2. Downgrading a lock (from X(A) to S(A)) must be done in the shrinking phase.
Transaction T1:
• Growing phase: steps 1-3
• Shrinking phase: steps 5-7
• Lock point: at step 3
Transaction T2:
• Growing phase: steps 2-6
• Shrinking phase: steps 8-9
• Lock point: at step 6
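The two-phase discipline can be sketched in Java with a hypothetical `lock`/`unlock` helper (the lock table and step comments are illustrative, keyed to transaction T1 above, and are not a real DBMS API):

```java
import java.util.HashMap;
import java.util.Map;

// A minimal sketch of two-phase locking (2PL): every lock acquisition
// (growing phase) precedes every release (shrinking phase), and a
// lock upgrade S -> X may only happen while still growing.
public class TwoPhaseLockingSketch {
    enum Mode { S, X }
    static Map<String, Mode> lockTable = new HashMap<>();

    static void lock(String item, Mode m) {   // allowed in the growing phase only
        lockTable.put(item, m);               // putting X over S models an upgrade
        System.out.println("lock-" + m + "(" + item + ")");
    }

    static void unlock(String item) {         // allowed in the shrinking phase only
        lockTable.remove(item);
        System.out.println("unlock(" + item + ")");
    }

    public static void main(String[] args) {
        // Growing phase: T1 acquires and then upgrades its lock on A
        lock("A", Mode.S);    // shared lock on A
        lock("A", Mode.X);    // upgrade S(A) -> X(A); this is T1's lock point
        // Shrinking phase: from here on, T1 may only release locks
        unlock("A");          // no further lock requests are permitted
    }
}
```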
END OF
CHAPTER FOUR
Chapter Five
Recoverability of Schedule
and
Database Recovery
We have discussed that
non-serial schedules which are not serializable
are called non-serializable schedules.
Non-serializable schedules may be recoverable or
irrecoverable.
Recoverable Schedules
If, in a schedule,
a transaction performs a dirty read from an
uncommitted transaction, and its commit
operation is delayed until the uncommitted
transaction either commits or rolls back, then
such a schedule is called a Recoverable Schedule.
• Cascading Schedule
• If, in a schedule, the failure of one transaction causes several other dependent
transactions to roll back or abort, then such a schedule is called
a Cascading Schedule (also Cascading Rollback or Cascading Abort).
• This leads to wasted CPU time.
• Here,
• Transaction T2 depends on transaction T1.
• Transaction T3 depends on transaction T2.
• Transaction T4 depends on transaction T3.
• In this schedule,
• The failure of transaction T1 causes transaction T2 to roll back.
• The rollback of transaction T2 causes transaction T3 to roll back.
• The rollback of transaction T3 causes transaction T4 to roll back.
• Such a rollback is called a Cascading Rollback.
• NOTE-
• If transactions T2, T3, and T4 had committed before the failure of
transaction T1, then the schedule would have been irrecoverable.
Cascadeless Schedule
• If, in a schedule, a transaction is not allowed to read a data item
until the last transaction that wrote it has committed or
aborted, then such a schedule is called a Cascadeless
Schedule.
• In other words,
• A cascadeless schedule allows only committed read operations.
• Therefore, it avoids cascading rollbacks and thus saves CPU time.
Example
NOTE-
• Cascadeless schedule allows only committed read operations.
• However, it allows uncommitted write operations.
Strict Schedule
• If, in a schedule, a transaction is allowed neither to read
nor to write a data item until the last transaction that
wrote it has committed or aborted, then such a schedule
is called a Strict Schedule.
• In other words,
A strict schedule allows only committed read and write
operations.
This means a strict schedule imposes more
restrictions than a cascadeless schedule.
Example
Remember-
Strict schedules are stricter than cascadeless schedules.
All strict schedules are cascadeless schedules.
Not all cascadeless schedules are strict schedules.
Database Recovery
Purpose of Database Recovery
To bring the database to the last consistent state that existed prior
to the failure.
To preserve the transaction properties (Atomicity, Consistency, Isolation,
and Durability).
Example:
If the system crashes before a fund-transfer transaction completes its
execution, then one or both accounts may have incorrect values.
Thus, the database must be restored to the state it was in before the
transaction modified any of the accounts.
Types of Failure
• The database may become unavailable for use due to
Transaction failure: Transactions may fail because of
incorrect input, deadlock, or incorrect synchronization.
System failure: The system may fail because of an addressing
error, an application error, an operating system fault, RAM
failure, etc.
Media failure: Disk head crash, power interruption, etc.
Transaction Log
For recovery from any type of failure, the data value prior
to modification (BFIM - BeFore IMage) and the new
value after modification (AFIM - AFter IMage) are
required.
BFIM: the old value of the data item before the
update
AFIM: the new value of the data item after the
update
These values, together with other information, are stored in
a sequential file called the transaction log.
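A log entry pairing a BFIM with an AFIM can be sketched as a small Java class; the field names and layout are illustrative, not a real DBMS log format:

```java
// A minimal sketch of one transaction-log record holding the
// before image (BFIM) and after image (AFIM) of a single update.
public class LogRecord {
    String transactionId;   // which transaction made the change
    String dataItem;        // which data item was updated
    int bfim;               // old value, needed to undo the update
    int afim;               // new value, needed to redo the update

    LogRecord(String tid, String item, int bfim, int afim) {
        this.transactionId = tid;
        this.dataItem = item;
        this.bfim = bfim;
        this.afim = afim;
    }

    public static void main(String[] args) {
        // TX changes account A from $300 to $250; both images are logged
        LogRecord r = new LogRecord("TX", "A", 300, 250);
        System.out.println("BFIM=" + r.bfim + " AFIM=" + r.afim);
    }
}
```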
Data Update
• Immediate update: As soon as a data item is modified in the
cache, the disk copy is updated.
• Deferred update: All modified data items in the cache are
written either after a transaction ends its execution or after a
fixed number of transactions have completed their
execution.
• Shadow update: The modified version of a data item does
not overwrite its disk copy but is written to a separate disk
location.
Recovery Scheme:
1. Deferred Update (No Undo/Redo)
The data update proceeds as follows:
• A set of transactions records their updates in the log.
• At the commit point, under the WAL scheme, these updates are saved
to the database disk.
• Write-Ahead Logging (WAL) is a standard method for
ensuring data integrity: a mechanism used to
ensure data durability, consistency, and recovery in the event
of failures.
• After rebooting from a failure, the log is used to redo all the
transactions affected by the failure.
• No undo is required because no AFIM is flushed to disk
before a transaction commits.
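The redo-only recovery pass can be sketched in Java (the log, database map, and transaction IDs are illustrative; a real recovery manager would work against disk pages, not an in-memory map):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// A minimal sketch of deferred update (NO-UNDO/REDO): updates are
// recorded in the log first and applied to the database only for
// committed transactions. Uncommitted work needs no undo because it
// never reached the disk copy.
public class DeferredUpdateSketch {
    record LogEntry(String tid, String item, int afim) {}

    // Redo pass: apply the AFIMs of committed transactions, in log order
    static Map<String, Integer> redo(List<LogEntry> log, Set<String> committed,
                                     Map<String, Integer> database) {
        for (LogEntry e : log) {
            if (committed.contains(e.tid())) {
                database.put(e.item(), e.afim());   // redo the committed update
            }
        }
        return database;
    }

    public static void main(String[] args) {
        Map<String, Integer> database = new HashMap<>(Map.of("A", 300));
        List<LogEntry> log = List.of(
            new LogEntry("T1", "A", 250),   // T1 committed before the crash
            new LogEntry("T2", "A", 500));  // T2 never reached its commit point
        Set<String> committed = Set.of("T1");

        redo(log, committed, database);
        // T1's update is redone; T2's update is skipped, and no undo is needed
        System.out.println("A = " + database.get("A")); // A = 250
    }
}
```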
3. Shadow Paging
The AFIM does not overwrite its BFIM but is recorded at
another place on the disk.
Thus, at any time, a data item has its AFIM and BFIM
(the shadow copy of the data item) at two different places on
the disk.
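The two-copies idea can be sketched with two page tables in Java; the page names and map-based "disk" are illustrative stand-ins for real disk pages:

```java
import java.util.HashMap;
import java.util.Map;

// A minimal sketch of shadow paging: the current page table points a
// data item at its AFIM page, while the shadow page table still points
// at the BFIM page. Recovery simply discards the current table.
public class ShadowPagingSketch {
    static Map<String, Integer> disk = new HashMap<>();       // page -> value
    static Map<String, String> shadowTable = new HashMap<>(); // item -> page
    static Map<String, String> currentTable;

    public static void main(String[] args) {
        disk.put("page1", 300);            // BFIM of data item A on disk
        shadowTable.put("A", "page1");     // shadow page table: A -> page1

        // The update writes the AFIM to a fresh page, never over the BFIM
        currentTable = new HashMap<>(shadowTable);
        disk.put("page2", 250);            // AFIM at a different disk location
        currentTable.put("A", "page2");    // current page table: A -> page2

        // Both images now exist on disk at the same time
        System.out.println("current: " + disk.get(currentTable.get("A"))); // 250
        System.out.println("shadow:  " + disk.get(shadowTable.get("A"))); // 300
    }
}
```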
Cont.
• The Global Transaction Manager (GTM) is responsible for
recoverability in an MDBS (multidatabase system).
• The GTM works with a set of Interface Servers (servers, for short) and
multiple LDBSs.
• Each LDBS has an associated server.
• An LDBS consists of a DBMS and at least one database.
• The GTM comprises three modules:
1. Global Transaction Interface (GTI)
2. Global Scheduler (GS)
3. Global Recovery Manager (GRM)
Group Assignment
Group 1: Data Fragmentation Techniques in Distributed
Database Design
Group 2: Replication and Allocation Techniques for Distributed
Database Design
Group 3: Types of Distributed Database Systems
Group 4: Query Processing in Distributed Databases
End of Chapter-5
Chapter 6
Data Structures
The Set
A set is a collection of unique elements, often
referred to as distinct values or items.
It is an abstract data type that can store a collection of data
elements without any particular order.
Each element in the set is unique, meaning that no two elements
can have the same value.
Sets are commonly used in computer science and programming
to solve problems that involve storing and manipulating distinct
elements.
They provide operations such as inserting an element into the
set, removing an element from the set, checking if an element is
present in the set, and performing operations like union,
intersection, and difference between sets.
It does not allow duplicate elements. When an element is inserted into a set, the set
checks if the element already exists, and if it does, the insertion is ignored.
Sets are often implemented using data structures such as arrays, linked lists,
hash tables, or binary trees.
The choice of implementation depends on the specific requirements of the
problem at hand, such as the expected size of the set, the expected number of
operations, and the desired time complexity for the operations.
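The union, intersection, and difference operations mentioned above can be sketched with `java.util.HashSet` and its bulk operations (`addAll`, `retainAll`, `removeAll`); the helper method names are illustrative:

```java
import java.util.HashSet;
import java.util.Set;

public class SetOperations {
    static Set<Integer> union(Set<Integer> a, Set<Integer> b) {
        Set<Integer> r = new HashSet<>(a);
        r.addAll(b);                     // every element in a or b
        return r;
    }

    static Set<Integer> intersection(Set<Integer> a, Set<Integer> b) {
        Set<Integer> r = new HashSet<>(a);
        r.retainAll(b);                  // elements in both a and b
        return r;
    }

    static Set<Integer> difference(Set<Integer> a, Set<Integer> b) {
        Set<Integer> r = new HashSet<>(a);
        r.removeAll(b);                  // elements in a but not in b
        return r;
    }

    public static void main(String[] args) {
        Set<Integer> a = Set.of(1, 2, 3);
        Set<Integer> b = Set.of(2, 3, 4);
        System.out.println("union: " + union(a, b));               // {1, 2, 3, 4}
        System.out.println("intersection: " + intersection(a, b)); // {2, 3}
        System.out.println("difference: " + difference(a, b));     // {1}
    }
}
```

Copying into a fresh `HashSet` before each bulk operation keeps the inputs unmodified, since `retainAll` and `removeAll` mutate the receiver in place.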
Set Implementation Classes
Implementing a set using the HashSet class in Java:
import java.util.HashSet;
import java.util.Set;
public class SetExample {
    public static void main(String[] args) {
        // Create a set of integers
        Set<Integer> numberSet = new HashSet<>();

        // Add elements to the set
        numberSet.add(10);
        numberSet.add(20);
        numberSet.add(30);
        numberSet.add(40);

        // Print the set
        System.out.println("Number Set: " + numberSet);

        // Check if an element exists in the set
        boolean contains20 = numberSet.contains(20);
        System.out.println("Contains 20: " + contains20);

        // Remove an element from the set
        boolean removed = numberSet.remove(30);
        System.out.println("Removed 30: " + removed);
    }
}
// Enqueue elements
queue.add("Chala");
queue.add("Abebe");
queue.add("Mike");