Serializability of Scheduling
1. Introduction
In a multi-user database environment, multiple transactions often execute concurrently to
improve performance and resource utilization. However, concurrent execution can lead to
problems such as lost updates, dirty reads (reading uncommitted data), and inconsistent results.
To preserve database correctness, the DBMS uses a concept called serializability, which
ensures that the concurrent execution of transactions is equivalent to some serial
execution of those transactions.
2. Definition
Serializability is the highest level of isolation in concurrency control. A schedule is said to
be serializable if its result is equivalent to some serial schedule — i.e., one where
transactions are executed one after another without overlapping.
In other words, although transactions may interleave during actual execution, the final
outcome (state of the database) must be as if they were executed in some serial order.
3. Types of Schedules
Serial Schedule: Transactions execute one after another with no interleaving.
Non-serial Schedule: Operations of different transactions interleave; such a schedule may or may not be serializable.
Serializable Schedule: A non-serial schedule whose effect on the database is equivalent to that of some serial schedule.
4. Types of Serializability
4.1 Conflict Serializability
Based on the idea of conflicting operations: two operations conflict if they belong to different transactions, access the same data item, and at least one of them is a write. The conflicting pairs are:
Read-Write ( RW )
Write-Read ( WR )
Write-Write ( WW )
Conflict serializability is checked using a Precedence Graph (also known as a
Serialization Graph):
Steps:
1. Create a node for each transaction Ti in the schedule.
2. For every pair of conflicting operations between Ti and Tj, add an edge from Ti → Tj if Ti's operation comes first.
3. The schedule is conflict serializable if and only if the precedence graph contains no cycle.
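A minimal Python sketch of this test, assuming a schedule is represented as a list of (transaction, operation, item) triples (the format and function names are illustrative, not part of any standard API):

    from collections import defaultdict

    def precedence_graph(schedule):
        """Add an edge Ti -> Tj for every pair of conflicting operations
        where Ti's operation appears first in the schedule."""
        edges = defaultdict(set)
        for i, (ti, op_i, item_i) in enumerate(schedule):
            for tj, op_j, item_j in schedule[i + 1:]:
                # Conflict: different transactions, same item, at least one write
                if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                    edges[ti].add(tj)
        return edges

    def has_cycle(edges):
        """Depth-first search; reaching a node already on the current path means a cycle."""
        WHITE, GREY, BLACK = 0, 1, 2
        color = defaultdict(int)

        def visit(node):
            color[node] = GREY
            for nxt in edges[node]:
                if color[nxt] == GREY or (color[nxt] == WHITE and visit(nxt)):
                    return True
            color[node] = BLACK
            return False

        return any(color[n] == WHITE and visit(n) for n in list(edges))

    def is_conflict_serializable(schedule):
        return not has_cycle(precedence_graph(schedule))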
4.2 View Serializability
Two schedules are view equivalent if:
Each transaction reads the same initial value of each data item in both schedules.
Each read operation in one schedule reads the same value as in the other.
The final write on each data item is performed by the same transaction in both schedules.
A schedule is view serializable if it is view equivalent to some serial schedule.
Commitment ordering is a stronger notion that enforces that transactions commit in the same order as the serialization order.
5. Importance of Serializability
Ensures consistency and correctness of the database under concurrent execution.
Prevents concurrency anomalies such as:
Dirty Read
Lost Update
Non-repeatable Read
Phantom Read
In practice, serializability is enforced by concurrency control protocols such as Two-Phase Locking (2PL), Timestamp Ordering, and validation-based (optimistic) schemes.
6. Serializability vs Recoverability
Concept            Focus                                    Ensures
Serializability    Correct interleaving of operations       The outcome is equivalent to some serial execution
Recoverability     Correct handling of commits and aborts   A transaction commits only after every transaction it read from has committed
7. Example
Consider two transactions:
T1: Read(A); Write(A)
T2: Read(A); Write(A)
If interleaved as:
T1: Read(A)
T2: Read(A)
T2: Write(A)
T1: Write(A)
then the precedence graph has an edge T1 → T2 (T1's Read(A) precedes T2's Write(A)) and an edge T2 → T1 (T2's Read(A) precedes T1's Write(A)), which forms a cycle. The schedule is therefore not conflict serializable, and T2's update is lost when T1 overwrites A (a lost update).
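Running the precedence-graph sketch from Section 4.1 on this interleaving (again purely illustrative) confirms the cycle:

    schedule = [("T1", "R", "A"), ("T2", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
    print(is_conflict_serializable(schedule))   # False: edges T1 -> T2 and T2 -> T1 form a cycle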
8. Limitations
Testing view serializability is NP-complete, so it is impractical to check in general; conflict serializability is used more in practice because it can be tested efficiently by cycle detection in the precedence graph.
Conclusion
Serializability is the cornerstone of concurrency control in DBMS. It ensures that
concurrent execution of transactions results in a database state that could be obtained
under some serial execution, thereby preserving correctness. While conflict serializability
is easier to implement and verify, view serializability offers a broader notion of
correctness. Understanding and enforcing serializability is essential for designing reliable
and consistent transaction processing systems.
Lock-Based and Timestamp-Based Schedulers
1. Lock-Based Scheduler
Locking is the most widely used way to enforce serializability. Two lock modes are used:
1. Shared Lock (S-lock): Allows a transaction to read a data item. Several transactions can hold shared locks on the same item at the same time.
2. Exclusive Lock (X-lock): Allows a transaction to read and write a data item. Only one transaction can hold an exclusive lock on a data item at a time.
Lock compatibility:
                 Shared (S)    Exclusive (X)
Shared (S)       Yes           No
Exclusive (X)    No            No
Two-Phase Locking (2PL): every transaction acquires and releases its locks in two phases:
1. Growing Phase: Transaction can acquire locks but cannot release any.
2. Shrinking Phase: Transaction releases locks but cannot acquire new ones.
Strict 2PL: A special case where a transaction holds all its locks until it commits or aborts.
This ensures recoverability and avoids cascading aborts.
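A toy, single-threaded Python sketch of these rules (class and method names are invented for illustration, not a real DBMS interface): acquire applies the S/X compatibility matrix, and release_all is called only at commit or abort, as strict 2PL requires.

    class LockManager:
        def __init__(self):
            self.table = {}                      # item -> (mode, set of holding transactions)

        def acquire(self, txn, item, mode):      # mode is "S" or "X"
            held = self.table.get(item)
            if held is None:
                self.table[item] = (mode, {txn})
                return True
            cur_mode, holders = held
            if mode == "S" and cur_mode == "S":  # S is compatible with S
                holders.add(txn)
                return True
            if holders == {txn}:                 # sole holder: re-request or upgrade S -> X
                self.table[item] = ("X" if "X" in (mode, cur_mode) else "S", holders)
                return True
            return False                          # incompatible: the caller must wait

        def release_all(self, txn):
            # Strict 2PL: release everything at once, only at commit or abort.
            for item in list(self.table):
                mode, holders = self.table[item]
                holders.discard(txn)
                if not holders:
                    del self.table[item]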
Deadlocks under locking are handled by deadlock detection (wait-for graphs), prevention schemes such as wait-die and wound-wait, or Timeouts.
2. Timestamp-Based Scheduler
Each transaction T receives a unique timestamp TS(T) when it starts, and each data item X keeps:
read_TS(X): Largest timestamp of any transaction that read X
write_TS(X): Largest timestamp of any transaction that wrote X
Read Rule: If TS(T) < write_TS(X), the value T needs has already been overwritten by a younger transaction, so T is rolled back; otherwise the read proceeds and read_TS(X) is set to max(read_TS(X), TS(T)).
Write Rule: If TS(T) < read_TS(X) or TS(T) < write_TS(X), a younger transaction has already read or written X, so T is rolled back; otherwise the write proceeds and write_TS(X) is set to TS(T).
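A compact sketch of the two rules with integer timestamps (names are illustrative):

    class Item:
        def __init__(self):
            self.read_ts = 0     # largest TS of any transaction that read the item
            self.write_ts = 0    # largest TS of any transaction that wrote the item

    def to_read(item, ts):
        if ts < item.write_ts:               # a younger transaction already overwrote X
            return "rollback"
        item.read_ts = max(item.read_ts, ts)
        return "ok"

    def to_write(item, ts):
        if ts < item.read_ts or ts < item.write_ts:   # a younger transaction read or wrote X
            return "rollback"
        item.write_ts = ts
        return "ok"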
2.3 Advantages
Deadlock-free: Since transactions don’t wait, there are no cycles in wait-for graphs.
2.4 Disadvantages
May cause frequent rollbacks (especially under high contention)
Does not guarantee recoverability or prevent cascading aborts by default; this requires extensions such as strict timestamp ordering or multi-version concurrency control (MVCC). Thomas' Write Rule is a further refinement that avoids some unnecessary rollbacks by ignoring obsolete writes.
Aspect             Lock-Based Scheduler       Timestamp-Based Scheduler
Recovery Support   Easier with strict 2PL     Needs additional mechanisms
Conclusion
Both Lock-Based and Timestamp-Based Schedulers are fundamental techniques in
concurrency control. Lock-based schedulers are more commonly used due to their
simplicity and direct support for recoverability, especially when implemented with strict
2PL. Timestamp-based schedulers are conceptually elegant and avoid deadlocks, but
require careful handling to manage rollbacks and ensure recoverability.
The choice between the two depends on the workload characteristics, system
requirements, and design priorities of the DBMS.
Multi-Version and Optimistic Concurrency Control
1. Multi-Version Concurrency Control (MVCC)
1.1 Definition
MVCC keeps multiple versions of each data item; every write creates a new version instead of overwriting the old one.
1.2 Motivation
In traditional locking schemes, readers and writers can block each other. MVCC eliminates
this issue by letting readers see a consistent snapshot of the database as of the time they
began, without waiting for writers to finish.
1.3 Version Information
Each version of a data item stores its value and the timestamp of the transaction that wrote it; sometimes also: read timestamps, transaction ID.
1.4 Operations:
Read(X): T reads the version Xk with the largest write timestamp such that write_TS(Xk) ≤ TS(T). Reads are never rejected.
Write(X): If a younger transaction has already read the version T would overwrite, T is rolled back; otherwise a new version of X is created with write_TS = TS(T).
1.5 Advantages
Readers never block writers, and writers never block readers
Reduced deadlocks
1.6 Example
Assume T1 starts at time TS=10, and T2 updates a record at TS=12. T1 will continue to read the older version (written before TS=10) even if T2 commits. This guarantees that T1 reads a consistent snapshot of the database.
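A toy sketch of the read rule behind this example (the version list and timestamps are illustrative): each item keeps its versions as (write_TS, value) pairs, and a reader sees the newest version written at or before its own timestamp.

    versions_of_A = [(5, "old value"), (12, "value written by T2")]   # (write_TS, value)

    def mv_read(versions, ts):
        visible = [v for v in versions if v[0] <= ts]
        return max(visible)[1] if visible else None    # newest version not newer than ts

    print(mv_read(versions_of_A, 10))   # T1 (TS=10) still sees "old value" after T2 commits
    print(mv_read(versions_of_A, 12))   # T2 (TS=12) sees its own newer version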
2. Optimistic Concurrency Control (OCC)
Optimistic Concurrency Control (OCC) is based on the assumption that conflicts between
transactions are rare. Therefore, it allows transactions to execute without restrictive control
and only validates at the end whether the transaction can commit.
A transaction executes in three phases:
1. Read Phase
The transaction reads values and performs computations using local copies.
2. Validation Phase
The system checks whether committing this transaction would violate serializability.
3. Write Phase
If validation passes, changes are written to the database.
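A hedged sketch of one common way to implement the validation phase (backward validation; the transaction bookkeeping shown is an assumption, not a standard API): a transaction passes validation only if nothing it read was overwritten by a transaction that committed after it started.

    from collections import namedtuple

    Txn = namedtuple("Txn", "start_ts read_set write_set")

    def validate(txn, committed):
        """committed: list of (commit_ts, write_set) for already-committed transactions."""
        for commit_ts, other_writes in committed:
            if commit_ts > txn.start_ts and txn.read_set & other_writes:
                return False     # txn read an item overwritten by a concurrent committer
        return True

    # Example: T started at ts=5 and read {"A"}; another transaction wrote "A" and committed at ts=7.
    t = Txn(start_ts=5, read_set={"A"}, write_set={"B"})
    print(validate(t, [(7, {"A"})]))   # False -> roll back and restart T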
2.5 Disadvantages of OCC
High rollback rate in high-contention workloads
2.6 Typical Applications
Low-contention, read-mostly workloads
Mobile/disconnected environments
Aspect              MVCC                               OCC
Concurrency Model   Multi-version snapshot             Single-version, optimistic validation
Rollbacks           Rare (only for write conflicts)    Can be frequent under high contention
Conclusion
Both MVCC and OCC aim to improve concurrency by avoiding locks, but they take different
approaches:
MVCC sacrifices storage for read performance, ideal for workloads with frequent reads
and moderate writes.
OCC delays conflict detection until commit time, making it suitable for distributed or
mobile systems with low contention.
Modern systems may combine both strategies or dynamically adapt between them
depending on the workload and contention level.
Database Recovery
1. Introduction
A database must remain correct and durable despite failures such as:
System crashes
Power outages
Software bugs
Disk failures
Transaction aborts
To achieve this, Database Recovery is the process of restoring the database to a correct state after a failure.
2. Types of Failures
1. Transaction Failure: logical errors or deadlock-induced aborts within a single transaction.
2. System Failure: a crash or power loss in which main memory is lost but the disk survives.
3. Media Failure: loss or corruption of non-volatile storage, e.g. a disk head crash.
4. Application/Software Failure: bugs in the application or DBMS that leave data in an incorrect state.
3. Recovery Objectives
Undo the effects of incomplete or failed transactions
Redo the effects of committed transactions whose updates may not yet have reached disk
4. Key Concepts
4.1 Transaction States
Active: Executing
Partially Committed: Final statement executed, waiting to commit
Committed: Effects made permanent
Failed / Aborted: Execution stopped and changes rolled back
4.2 The Log
The log is an append-only record of database activity on stable storage, with entries such as:
<T_i, START>
<T_i, X, old_value, new_value> (update record)
<T_i, COMMIT>
<T_i, ABORT>
Logs are typically written to stable storage before actual data is changed (Write-Ahead
Logging – WAL).
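A toy illustration of the WAL discipline (not a real DBMS interface): the update record is appended to the log before the corresponding data change is applied.

    log = []          # stands in for the log on stable storage
    database = {}     # stands in for data pages on disk

    def wal_write(txn_id, item, new_value):
        old_value = database.get(item)
        log.append((txn_id, item, old_value, new_value))   # <Ti, X, old, new> written first
        database[item] = new_value                          # only then is the data changed

    def wal_commit(txn_id):
        log.append((txn_id, "COMMIT"))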
5. Recovery Techniques
5.1 Deferred Update (No-Undo/Redo)
Changes are not written to the database until the transaction commits.
On failure: no need to undo uncommitted changes; committed changes are redone from the log.
5.2 Immediate Update (Undo/Redo)
Changes may be written to the database before the transaction commits, so both old and new values are logged.
On failure: uncommitted changes are undone using the logged old values, and committed changes are redone using the new values.
5.3 Checkpoints
A checkpoint is a snapshot of the database state written to disk at regular intervals to
reduce recovery time.
Process:
1. Flush all log records currently in main memory to stable storage.
2. Force-write all modified (dirty) buffer blocks to disk.
3. Write a <CHECKPOINT> record, listing the currently active transactions, to the log.
During recovery, the system can skip scanning the entire log and start from the most recent
checkpoint.
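A minimal sketch, reusing the toy log above, of writing a checkpoint record and finding the most recent one at restart (the record format is an assumption):

    def write_checkpoint(log, active_transactions):
        log.append(("CHECKPOINT", tuple(active_transactions)))

    def last_checkpoint_index(log):
        for i in range(len(log) - 1, -1, -1):
            if log[i][0] == "CHECKPOINT":
                return i          # recovery scans the log from here onwards
        return 0                  # no checkpoint yet: scan the whole log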
6. Write-Ahead Logging (WAL) Protocol
Before writing a data block to disk, its log entry must be written first.
Recovery Actions:
Undo: restore the old values of items changed by uncommitted transactions
Redo: reapply the new values of items changed by committed transactions
7. Shadow Paging
Instead of a log, the DBMS keeps a current page table and a shadow page table; updated pages are written to new locations on disk. At commit, pointers are updated atomically to the new pages.
Advantages:
No logs required
Simple to implement
Disadvantages:
Data fragmentation on disk
Overhead of copying page tables and garbage-collecting old pages
Difficult to extend to concurrent transactions
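The core idea can be sketched in a few lines of Python (the page names and single global page table are simplifications):

    current_page_table = {"P1": "P1_v0", "P2": "P2_v0"}   # logical page -> physical page

    def run_transaction(updates):
        # Work on a copy; the untouched current_page_table acts as the shadow copy.
        new_table = dict(current_page_table)
        for page, new_physical_page in updates.items():
            new_table[page] = new_physical_page   # updated pages go to new disk locations
        return new_table

    def commit(new_table):
        global current_page_table
        current_page_table = new_table            # one atomic pointer swap makes the commit durable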
8. ARIES
A widely used log-based recovery algorithm that:
Uses WAL
Repeats history before undoing incomplete transactions
Phases:
1. Analysis: identify dirty pages and transactions active at the time of the crash
2. Redo: reapply logged updates to repeat history
3. Undo: roll back transactions that had not committed
9. Summary of Recovery Actions
Transaction Status    Recovery Action
Committed             Redo
Uncommitted           Undo
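Putting the pieces together, a hedged sketch of restart recovery over the toy log format used earlier (in practice the scan would begin at the most recent checkpoint rather than the start of the log):

    def recover(log, database):
        committed = {rec[0] for rec in log if len(rec) == 2 and rec[1] == "COMMIT"}

        # Redo phase: reapply new values of committed transactions, in log order.
        for rec in log:
            if len(rec) == 4 and rec[0] in committed:
                txn, item, old, new = rec
                database[item] = new

        # Undo phase: restore old values of uncommitted transactions, newest first.
        for rec in reversed(log):
            if len(rec) == 4 and rec[0] not in committed:
                txn, item, old, new = rec
                if old is None:
                    database.pop(item, None)   # the item did not exist before this update
                else:
                    database[item] = old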
10. Conclusion
Database recovery is essential for ensuring atomicity and durability in transaction
processing. Whether through log-based approaches, shadow paging, or advanced
techniques like ARIES, the DBMS must be equipped to handle failures gracefully and
restore the database to a consistent state. Recovery mechanisms form the backbone of
reliable data management in mission-critical systems.