DBMSass2 Removed

The document discusses various database management concepts, including indexing types (Primary, Clustered, Non-Clustered), B-Trees and B+ Trees properties, transaction properties (ACID), serializability types, hashing techniques, and log-based and timestamp-based protocols. Each section provides definitions, examples, merits, and demerits of the concepts. The content is structured as a comprehensive assignment on database management systems.

Uploaded by

pullurishikhar

100522729022

K.TEJESHWAR

DATABASE MANAGEMENT SYSTEM


ASSIGNMENT -2
Q) Discuss indexing and its types.
Indexing in DBMS is a technique to improve query performance by creating a data structure that helps locate
rows quickly in a table.

Types of Indexing:

1. Primary Index

 Built on a table's primary key.

 Data in the table is physically sorted based on the primary key.

Example Table: Student

Student_ID (Primary Key) Name Age

101 Alice 20

102 Bob 22

103 Charlie 21

 The index will store Student_ID values (101, 102, 103) with pointers to their rows.

2. Clustered Index

 Determines the physical order of rows in the table.

 Only one clustered index is allowed per table.

Example Table: Employee (Clustered on Salary)

Emp_ID Name Salary (Clustered)

1 John 3000

3 Sarah 4000

2 Emma 5000

 The rows are physically sorted by Salary, and the index reflects this order.

3. Non-Clustered Index

 Separate structure that contains the indexed column and pointers to rows in the table.

 Can have multiple non-clustered indexes on a table.

Example Table: Products

Product_ID Name Price

201 Keyboard 20

202 Mouse 15

203 Monitor 120

 A non-clustered index on Price stores values (15, 20, 120) with pointers to rows, while the table itself is
unsorted.
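
The non-clustered index on Price can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the table is an in-memory list, a "pointer" is a list position, and the names `products`, `price_index`, and `find_by_price` are hypothetical.

```python
from bisect import bisect_left

# The table itself stays in insertion order (it is not sorted by Price).
products = [
    (201, "Keyboard", 20),
    (202, "Mouse", 15),
    (203, "Monitor", 120),
]

# Non-clustered index: a separate, sorted list of (Price, row position) pairs.
price_index = sorted((row[2], pos) for pos, row in enumerate(products))


def find_by_price(price):
    """Binary-search the index, then follow each pointer into the table."""
    keys = [key for key, _ in price_index]
    i = bisect_left(keys, price)
    matches = []
    while i < len(price_index) and price_index[i][0] == price:
        matches.append(products[price_index[i][1]])
        i += 1
    return matches
```

Calling `find_by_price(120)` searches the sorted index instead of scanning every row, and the table order never changes.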

Q) Write the properties, merits, and demerits of B-Trees and B+ Trees.


B-Trees

Properties:

1. Balanced Tree: All leaf nodes are at the same level.

2. Multi-way Tree: Each node can have multiple keys and children (defined by the order m of the tree).

3. Dynamic Growth: Automatically adjusts its height as keys are inserted or deleted.

4. Internal Node Storage: Keys and data are stored in internal and leaf nodes.

Merits:

 Efficient Searching: Reduces the number of disk accesses due to its balanced structure.

 Fast Updates: Insertion and deletion are efficient, maintaining balance dynamically.

 Supports Sequential Access: Keys are sorted, enabling range queries.

Demerits:

 Complex Structure: Managing internal nodes and balancing increases implementation complexity.

 Slower for Sequential Access: Requires traversing internal nodes to access all records.

B+ Trees

Properties:

1. Leaf Nodes Only for Data: All data records are stored in the leaf nodes; internal nodes store keys for
navigation.

2. Linked Leaf Nodes: Leaf nodes are linked sequentially for faster traversal.

3. Balanced Tree: Like B-trees, all leaf nodes are at the same level.

Merits:

 Fast Sequential Access: Linked leaf nodes allow quick traversal of all records.

 Efficient Range Queries: Easier and faster to scan a range of keys.

 Compact Internal Nodes: Internal nodes only store keys, leading to better utilization of memory.

Demerits:

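• More Disk Access for Some Lookups: A single-key lookup always descends to a leaf node, whereas a B-Tree
search can stop at an internal node that holds the key.

• Higher Storage Requirement: Keys stored in internal nodes are repeated in the leaves, and the leaf links need
additional pointers.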

Comparison:

• B-Trees can be faster for single-key lookups, because a search may finish at an internal node that stores the
key and its data.

• B+ Trees are preferred for range queries and sequential access due to linked leaf nodes.

Q) Explain a transaction, its properties, and its states.


Transaction in DBMS

A transaction is a sequence of one or more database operations (like read, write, update, or delete) performed
as a single logical unit of work. A transaction must maintain the database's consistency, even in case of failures.

Properties of Transactions (ACID Properties)

1. Atomicity:

 A transaction is treated as a single, indivisible unit.

 Either all operations in the transaction are completed, or none are applied.

 Example: Transferring money from Account A to Account B must deduct from A and add to B; if
one fails, neither operation should occur.

2. Consistency:

 A transaction must transform the database from one consistent state to another.

 The integrity constraints of the database are maintained before and after the transaction.

 Example: If the total amount in all bank accounts is $10,000, this must remain true after the
transaction.

3. Isolation:

 Transactions are executed independently of one another.

 Intermediate states of a transaction must not be visible to other transactions.

• Example: While transferring money, another transaction should not see a partially updated balance.

4. Durability:

 Once a transaction is successfully committed, its changes are permanent, even in the case of a
system failure.

 Example: After a successful transfer, the updated balances must remain in the database even if
the system crashes.

States of a Transaction

1. Active:

 The transaction is currently being executed.

 Example: A series of operations (e.g., reading and writing data) is in progress.

2. Partially Committed:

 The transaction has completed its operations but has not yet been finalized.

 Example: All database operations are executed, but the transaction is waiting for a commit.

3. Committed:

 The transaction is successfully completed, and changes are saved permanently.

 Example: Money transfer is finalized, and both accounts reflect the updated balances.

4. Failed:

 The transaction cannot proceed due to an error (e.g., system failure or constraint violation).

 Example: A transfer fails if Account A does not have enough balance.

5. Aborted:

 The transaction is terminated, and all its operations are rolled back (undone).

 Example: After a failed transfer, balances return to their original state.



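State Transition Diagram

1. Active → Partially Committed → Committed (Successful Transaction)

2. Active → Failed → Aborted (Failed Transaction)

3. Active → Partially Committed → Failed → Aborted (Failure detected before the commit is finalized)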

Transactions ensure that database operations are consistent, reliable, and recoverable.
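
The state transitions can be written down as a small lookup table. The sketch below is illustrative only; the names `TRANSITIONS` and `is_legal_history` are hypothetical, and the Partially Committed → Failed transition is the standard textbook one for a failure that occurs before the commit is finalized.

```python
# Allowed next states for each transaction state.
TRANSITIONS = {
    "active": {"partially_committed", "failed"},
    "partially_committed": {"committed", "failed"},
    "failed": {"aborted"},
    "committed": set(),   # terminal state
    "aborted": set(),     # terminal state
}


def is_legal_history(states):
    """Return True if the list of states is a valid path starting at 'active'."""
    if not states or states[0] != "active":
        return False
    return all(nxt in TRANSITIONS[cur] for cur, nxt in zip(states, states[1:]))
```

For example, `["active", "partially_committed", "committed"]` is accepted, while `["active", "committed"]` is rejected because a transaction cannot skip the partially committed state.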

Q) Explain serializability and its types.


Serializability in DBMS

Serializability is a concept used to ensure the correctness of concurrent transactions. It ensures that the outcome
of executing multiple transactions concurrently is the same as if the transactions were executed serially (one
after the other).

Types of Serializability

1. Conflict Serializability

2. View Serializability

1. Conflict Serializability

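A schedule (sequence of operations) is conflict serializable if it can be transformed into a serial schedule by
swapping adjacent non-conflicting operations. Two operations conflict when they belong to different transactions,
access the same data item, and at least one of them is a write.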

Example:

 Transactions:

 T1: Reads and writes A.

 T2: Reads and writes A.

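Schedule: T1 Read(A), T2 Read(A), T2 Write(A), T1 Write(A)

• Conflicts occur because T1 and T2 both access A and at least one operation in each conflicting pair is a write.

• T1 Read(A) comes before T2 Write(A), which requires T1 before T2, while T2 Read(A) and T2 Write(A) come
before T1 Write(A), which requires T2 before T1. These requirements contradict each other, so this schedule is
not conflict serializable.

• If T1 completes before T2 starts (T1 Read(A), T1 Write(A), T2 Read(A), T2 Write(A)), every conflict points
from T1 to T2 and the schedule is serial.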
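
Conflict serializability is usually tested with a precedence graph: add an edge Ti → Tj for every conflicting pair in which Ti's operation comes first, then check for a cycle. The sketch below assumes a schedule is a list of `(transaction, operation, item)` tuples; the function name `conflict_serializable` is hypothetical.

```python
def conflict_serializable(schedule):
    """schedule: list of (txn, op, item) with op in {'R', 'W'}."""
    # Build precedence edges from every conflicting pair (earlier -> later).
    graph = {}
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "W" in (op1, op2):
                graph.setdefault(t1, set()).add(t2)

    def has_cycle(node, on_path, finished):
        if node in on_path:
            return True          # back edge: cycle found
        if node in finished:
            return False
        on_path.add(node)
        found = any(has_cycle(n, on_path, finished) for n in graph.get(node, ()))
        on_path.discard(node)
        finished.add(node)
        return found

    finished = set()
    # Serializable exactly when the precedence graph has no cycle.
    return not any(has_cycle(n, set(), finished) for n in list(graph))


interleaved = [("T1", "R", "A"), ("T2", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
serial = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
```

For the interleaved schedule the graph contains both T1 → T2 and T2 → T1, so the test reports that it is not conflict serializable; the serial schedule passes.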

2. View Serializability

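A schedule is view serializable if it is view equivalent to some serial schedule: every read sees the value written
by the same transaction (or the initial value) as in that serial schedule, and the final write on each data item is
made by the same transaction. Every conflict-serializable schedule is view serializable, but not the reverse.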

Example:

 Transactions:

 T1: Writes A and reads B.

 T2: Reads A and writes B.

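Schedule: T1 Write(A), T2 Read(A), T2 Write(B), T1 Read(B)

• The final writes of A and B match a serial schedule, but the reads do not: T2 reads A from T1 (so T1 must come
first) while T1 reads B from T2 (so T2 must come first). No serial order satisfies both, so this schedule is not
view serializable.

• A schedule that is view serializable without being conflict serializable needs blind writes, for example
T1 Read(A), T2 Write(A), T1 Write(A), T3 Write(A), which is view equivalent to the serial order T1, T2, T3.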

Comparison

Type | Description | Example Use Case
Conflict Serializability | Ensures correctness by swapping non-conflicting operations. | Database systems requiring strict operation ordering.
View Serializability | Ensures the same final result as a serial schedule. | Systems where operation reordering is not possible.

Conclusion

Serializability ensures that concurrent transactions maintain database consistency. Conflict serializability
focuses on operation reordering, while view serializability guarantees the same result regardless of reordering
constraints. Both are essential for ensuring reliable and consistent transaction processing in a DBMS.

Q) Hashing and its types?

Hashing in DBMS

Hashing is a technique used to map data to a fixed-size value called a hash key or hash code.
This mapping is done using a hash function, which helps in quick data retrieval, especially in
large datasets.

Types of Hashing

1. Static Hashing

• Description: The hash table size is fixed, and the same hash function is used throughout the lifetime of the
database.
• Structure: Keys are mapped to buckets using the hash function.
• Characteristics:
  • Easy to implement.
  • Performance degrades as the data grows beyond the table's capacity.
• Example:
  • Hash Function: h(key) = key % 10
  • Keys: {11, 22, 33}
  • Buckets: Bucket 1 → 11, Bucket 2 → 22, Bucket 3 → 33.
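
The static hashing example can be reproduced directly. This is a minimal sketch; `NUM_BUCKETS`, `h`, `buckets`, and `lookup` are illustrative names.

```python
NUM_BUCKETS = 10  # fixed for the lifetime of the table


def h(key):
    """Hash function from the example: h(key) = key % 10."""
    return key % NUM_BUCKETS


# One list per bucket; the number of buckets never changes.
buckets = [[] for _ in range(NUM_BUCKETS)]
for key in (11, 22, 33):
    buckets[h(key)].append(key)


def lookup(key):
    """Only the single bucket chosen by h(key) has to be examined."""
    return key in buckets[h(key)]
```

Because the bucket count is fixed, every additional key ending in 1 (21, 31, ...) also lands in bucket 1, which is why lookups slow down as the data outgrows the table.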

2. Dynamic Hashing

 Description: The hash table size grows or shrinks dynamically as the dataset changes.
 Characteristics:
 Adapts to changes in data size.
 Reduces overflow issues.
 Example:
 Keys: {101, 202, 303}.
 When a bucket overflows, the table doubles in size, and data is redistributed using a new hash
function.

3. Extendible Hashing

• Description: A form of dynamic hashing where a directory structure points to hash buckets.
• Characteristics:
  • The directory doubles in size only when an overflowing bucket cannot be split under the current number of
directory bits.
  • Uses a binary representation of hash values; a few bits of the hash select the directory entry.
• Example:
  • Directory: 00 → Bucket 0, 01 → Bucket 1, 10 → Bucket 2.

4. Linear Hashing

• Description: A dynamic hashing technique that grows the table incrementally rather than doubling its size.
• Characteristics:
  • Avoids sudden large memory usage.
  • Efficient for applications with gradual data growth.
• Example:
  • If the hash table size is initially 4, it increases gradually as buckets overflow (e.g., 4 → 5 → 6).

5. Open Hashing (Chaining)

 Description: Uses linked lists to handle collisions within a bucket.


 Characteristics:
 Keys mapping to the same bucket are stored in a list.

 Flexible, as lists can grow dynamically.


 Example:
 Bucket 1: {21 → 31 → 41}.
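
A chained bucket can be sketched with a small linked list per slot. The class names `Node` and `ChainedHashTable` are hypothetical, and the hash function key % 10 is assumed from the static hashing example.

```python
class Node:
    """One entry in a bucket's linked list."""

    def __init__(self, key):
        self.key = key
        self.next = None


class ChainedHashTable:
    def __init__(self, num_buckets=10):
        self.heads = [None] * num_buckets   # head node of each bucket's chain

    def insert(self, key):
        idx = key % len(self.heads)
        node = Node(key)
        if self.heads[idx] is None:
            self.heads[idx] = node
            return
        cur = self.heads[idx]
        while cur.next is not None:         # walk to the end of the chain
            cur = cur.next
        cur.next = node

    def chain(self, idx):
        """Return the keys stored in bucket idx, in chain order."""
        keys, cur = [], self.heads[idx]
        while cur is not None:
            keys.append(cur.key)
            cur = cur.next
        return keys

    def contains(self, key):
        return key in self.chain(key % len(self.heads))


table = ChainedHashTable()
for k in (21, 31, 41):
    table.insert(k)
```

After the three inserts, `table.chain(1)` holds 21, 31, and 41 in that order, matching the bucket shown in the example.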

Q) Explain the log-based protocol in detail.


Log-Based Protocol in DBMS

A log-based protocol ensures database consistency and supports recovery from system failures by maintaining
a log of all database transactions. The log is a sequential record of all operations performed by transactions.

Key Features of Log-Based Protocols:

1. Write-Ahead Logging (WAL):

 The log must be written to stable storage before any changes are made to the database.

 Ensures durability by maintaining a record of changes even if the system crashes.

2. Types of Log Entries:

 <T, start>: Indicates the start of transaction T.

 <T, X, old_value, new_value>: Records changes made by transaction T to data item X.

 <T, commit>: Indicates successful completion of transaction T.

 <T, abort>: Indicates rollback of transaction T.

Log-Based Recovery Techniques:

1. Undo Logging:

 Reverts changes of incomplete transactions during recovery.

 Condition: Write log entry <T, X, old_value, new_value> before updating the database.

 Example:

 Log: <T1, X, 10, 20>.

 On failure, revert X to 10.

2. Redo Logging:

• Reapplies changes of committed transactions during recovery.

• Condition: Write the update records and the <T, commit> record to the log before applying the changes to the
database.

 Example:

 Log: <T1, X, 10, 20>, <T1, commit>.

 On failure, reapply changes to set X = 20.

3. Undo-Redo Logging:

 Combines both techniques:

 Undo changes of incomplete transactions.

 Redo changes of committed transactions.
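
Undo-redo recovery can be sketched as two passes over the log. The representation is an assumption made for illustration: log records are tuples shaped like the entries listed earlier, the database is a dictionary, and `recover` is a hypothetical name.

```python
def recover(log, db):
    """Bring db (dict of item -> value) to a consistent state after a crash.

    Log records are tuples: (T, "start"), (T, X, old_value, new_value),
    (T, "commit"), or (T, "abort").
    """
    committed = {rec[0] for rec in log if len(rec) == 2 and rec[1] == "commit"}

    # Redo pass: scan forward and reapply updates of committed transactions.
    for rec in log:
        if len(rec) == 4 and rec[0] in committed:
            _, item, _, new_value = rec
            db[item] = new_value

    # Undo pass: scan backward and restore old values of uncommitted transactions.
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] not in committed:
            _, item, old_value, _ = rec
            db[item] = old_value

    return db


log = [
    ("T1", "start"), ("T1", "X", 10, 20), ("T1", "commit"),
    ("T2", "start"), ("T2", "Y", 5, 50),          # T2 never committed
]
```

If the crash left X at its old value 10 and Y at the uncommitted value 50, `recover(log, {"X": 10, "Y": 50})` sets X to 20 (redo T1) and Y back to 5 (undo T2).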

Advantages:

 Provides a reliable mechanism for crash recovery.

 Ensures database consistency and durability.

 Helps maintain ACID properties (Atomicity and Durability).

Disadvantages:

 Additional overhead due to logging.

 Increased storage requirements for maintaining logs.

Log-based protocols are fundamental in DBMS to ensure data integrity and support effective recovery in multi-
transactional environments.

Q) Explain the timestamp-based protocol in detail.


Timestamp-Based Protocol in DBMS

The Timestamp-Based Protocol is a concurrency control method used in databases to ensure serializability of
transactions. It assigns a unique timestamp to each transaction and uses these timestamps to order the execution
of operations, ensuring that conflicting operations follow the order of their timestamps.

Key Concepts

1. Timestamp (TS):

 A unique identifier for each transaction, typically based on the system clock or a counter.

 Older transactions have smaller timestamps, while newer transactions have larger timestamps.

2. Two Types of Timestamps for Data Items:

 Read_TS(X): The largest timestamp of any transaction that has read X.

 Write_TS(X): The largest timestamp of any transaction that has written X.



Rules of Timestamp Protocol

The protocol ensures serializability by controlling read and write operations as follows:

1. Read Operation (T tries to read X):

 If TS(T) < Write_TS(X):

 This means T is trying to read a value of X that has already been overwritten by a more
recent transaction.

 Action: The transaction T is aborted and restarted with a new timestamp.

 Otherwise:

 T is allowed to read X.

 Update Read_TS(X) to max(Read_TS(X), TS(T)).

2. Write Operation (T tries to write X):

• If TS(T) < Read_TS(X):

  ▪ This means a newer transaction has already read X, so it could not have seen the value T is about to
write.

  ▪ Action: The transaction T is aborted and restarted.

 If TS(T) < Write_TS(X):

 This means T is trying to write a value of X that has already been written by a newer
transaction.

 Action: The transaction T is aborted and restarted.

 Otherwise:

 T is allowed to write X.

 Update Write_TS(X) to TS(T).

Example

Transactions:

 T1: Timestamp = 1

 T2: Timestamp = 2

Operations:

1. T1: Read(X):

 TS(T1) = 1 and Write_TS(X) = 0 (initial value).

 TS(T1) > Write_TS(X) → Allowed.

 Update Read_TS(X) = max(Read_TS(X), TS(T1)) = 1.

2. T2: Write(X):

 TS(T2) = 2 and Read_TS(X) = 1.

 TS(T2) > Read_TS(X) → Allowed.

 Update Write_TS(X) = TS(T2) = 2.

3. T1: Write(X):

 TS(T1) = 1 and Write_TS(X) = 2.

 TS(T1) < Write_TS(X) → Abort T1 and restart with a new timestamp.
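
The read and write rules, together with the three-step example, can be checked with a short sketch. The names `Item`, `Abort`, `read`, `write`, and `run_example` are hypothetical, and the stored values are placeholders chosen for illustration.

```python
class Abort(Exception):
    """Raised when the protocol rejects an operation; T would restart with a new timestamp."""


class Item:
    def __init__(self, value=None):
        self.value = value
        self.read_ts = 0    # Read_TS(X)
        self.write_ts = 0   # Write_TS(X)


def read(ts, item):
    if ts < item.write_ts:                      # X was overwritten by a newer transaction
        raise Abort("read too late")
    item.read_ts = max(item.read_ts, ts)
    return item.value


def write(ts, item, value):
    if ts < item.read_ts or ts < item.write_ts:  # a newer transaction already read or wrote X
        raise Abort("write too late")
    item.value = value
    item.write_ts = ts


def run_example():
    """Replay T1 Read(X), T2 Write(X), T1 Write(X) with TS(T1) = 1 and TS(T2) = 2."""
    x = Item(value=100)
    trace = []
    for ts, op in [(1, "read"), (2, "write"), (1, "write")]:
        try:
            if op == "read":
                read(ts, x)
            else:
                write(ts, x, value=ts * 10)
            trace.append((ts, op, "allowed"))
        except Abort:
            trace.append((ts, op, "abort"))
    return trace, x
```

Running `run_example()` allows the first two operations and aborts T1's write, leaving Read_TS(X) = 1 and Write_TS(X) = 2 as in the walkthrough.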

Advantages

1. Ensures serializability by enforcing timestamp order.

2. Avoids deadlocks as transactions are aborted instead of waiting.

3. Simple and easy to implement in many systems.

Disadvantages

1. Cascading Abort: Aborting a transaction may lead to aborting dependent transactions.

2. Starvation: A transaction may be aborted and restarted repeatedly if it keeps conflicting with other transactions.

3. Overhead: Maintaining timestamps for transactions and data items adds complexity and requires additional
storage.

Use Case

Timestamp-based protocols suit workloads where conflicts are relatively rare and avoiding deadlock is critical.
Under high contention they may perform poorly because of frequent aborts and the risk of starvation.

Q) Explain the ACID properties.
ACID Properties in DBMS

ACID is a set of properties that guarantee that database transactions are processed reliably and ensure the
integrity of the database, even in cases of system crashes, power failures, or errors during transaction processing.

The four ACID properties are:



1. Atomicity

 Definition: A transaction is an atomic unit of work. It is either fully completed or fully rolled back. If any
part of the transaction fails, the entire transaction is canceled, and the database remains unchanged.

 Example: If a money transfer operation involves two steps—deducting money from one account and
adding it to another—both steps must be completed. If one step fails, the entire transaction is rolled
back, and neither account is updated.

2. Consistency

• Definition: A transaction must take the database from one consistent state to another. After the transaction
completes, all database rules, constraints, and relationships must be satisfied, ensuring data integrity.

• Example: A transaction transferring money from one account to another must preserve the total balance
across all accounts. If the system had a rule that no account could go below a minimum balance, the
transaction must not violate that rule.

3. Isolation

• Definition: Transactions should operate independently and not interfere with each other. The results of
a transaction should not be visible to other transactions until it is fully committed. This ensures that
intermediate results of one transaction do not affect other concurrent transactions.

• Example: If two transactions are trying to transfer money between accounts simultaneously, one transaction
should not see the changes made by the other until the other has committed.

4. Durability

• Definition: Once a transaction is committed, its changes to the database are permanent, even in the
case of a system crash or failure. The system must ensure that committed data is saved to non-volatile
storage.

• Example: After successfully completing a transaction that transfers money between two accounts, the
changes (such as new balances) should persist even if the database crashes immediately after the commit.

Summary Table of ACID Properties

Property | Definition | Example
Atomicity | A transaction is fully completed or fully rolled back. | If a transfer fails after deducting money, the system reverts to the original state.
Consistency | The database remains in a valid state before and after the transaction. | A transfer must not violate account balance constraints or database rules.
Isolation | Transactions do not interfere with each other. | Transaction A's changes should not be visible to Transaction B until A is committed.
Durability | Committed transactions are permanent and unaffected by crashes. | After a successful transfer, the changes (account balances) persist even if the system crashes.

Importance of ACID Properties

ACID properties are crucial in DBMS to ensure that transactions are executed in a reliable, consistent, and
recoverable manner. This guarantees the integrity of the database and prevents issues such as data corruption or
loss during concurrent processing or system failures.
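
Atomicity and consistency in the money-transfer example can be observed with Python's built-in `sqlite3` module. This is a sketch under assumptions: the `accounts` table, the `CHECK (balance >= 0)` rule, and the helper names `transfer` and `balances` are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"   # consistency rule
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one atomic unit; return True on commit."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            # The credit runs first so that a failing debit leaves a partial update to undo.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

A call such as `transfer(conn, "A", "B", 500)` violates the balance rule on the debit, so the earlier credit to B is rolled back as well and both balances stay unchanged.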

Q) Explain normalization with examples.


Normalization in DBMS

Normalization is the process of organizing the attributes and tables of a database to reduce redundancy and
undesirable dependencies. The goal is to store data efficiently, maintain data integrity, and reduce the
likelihood of anomalies (insertion, update, and deletion anomalies).

Normalization involves dividing a database into two or more tables and defining relationships between the
tables. It follows a series of stages called normal forms (NF), each of which eliminates specific types of
redundancy.

Normal Forms

First Normal Form (1NF):

 A table is in 1NF if:

 All columns contain atomic (indivisible) values.

 Each record is unique (no duplicate rows).

 Each field contains only a single value, not a set or list.

Example: Consider the following table (before 1NF):

Student_ID Student_Name Subjects



1 Alice Math, Science

2 Bob History, Math

In this case, the Subjects column contains multiple values, violating 1NF.

After 1NF (each subject in a separate row):

Student_ID Student_Name Subject

1 Alice Math

1 Alice Science

2 Bob History

2 Bob Math
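
The conversion to 1NF amounts to splitting the multi-valued Subjects column into one row per subject. A minimal sketch, with `unnormalized` and `to_1nf` as illustrative names:

```python
# Rows before 1NF: the Subjects column holds a comma-separated list.
unnormalized = [
    (1, "Alice", "Math, Science"),
    (2, "Bob", "History, Math"),
]


def to_1nf(rows):
    """Emit one (Student_ID, Student_Name, Subject) row per subject."""
    return [
        (student_id, name, subject.strip())
        for student_id, name, subjects in rows
        for subject in subjects.split(",")
    ]
```

`to_1nf(unnormalized)` yields the four atomic rows shown in the table above.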

Second Normal Form (2NF):

• A table is in 2NF if:

  ▪ It is in 1NF.

  ▪ It has no partial dependencies (i.e., non-prime attributes must depend on the whole of every candidate
key).

Example: Consider the following table (before 2NF):

Student_ID Course_ID Student_Name Course_Name

1 101 Alice Math

1 102 Alice Science

2 101 Bob Math

2 103 Bob History

Here, the non-prime attribute Student_Name depends only on Student_ID, and Course_Name depends only
on Course_ID. These are partial dependencies.

After 2NF (remove partial dependencies):

 Students Table:

Student_ID Student_Name

1 Alice

2 Bob

 Courses Table:

Course_ID Course_Name

101 Math

102 Science

103 History

 Enrollments Table (to maintain relationship):

Student_ID Course_ID

1 101

1 102

2 101

2 103
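
The 2NF decomposition is a set of projections of the original table, and joining the projections gives back the original rows. The sketch below uses plain Python sets; the variable names are illustrative.

```python
# Original table: (Student_ID, Course_ID, Student_Name, Course_Name)
rows = [
    (1, 101, "Alice", "Math"),
    (1, 102, "Alice", "Science"),
    (2, 101, "Bob", "Math"),
    (2, 103, "Bob", "History"),
]

# Project onto each dependency; sets remove the repeated values.
students = sorted({(sid, sname) for sid, _, sname, _ in rows})
courses = sorted({(cid, cname) for _, cid, _, cname in rows})
enrollments = sorted({(sid, cid) for sid, cid, _, _ in rows})

# Rejoin the three tables to confirm that no information was lost.
student_name = dict(students)
course_name = dict(courses)
rejoined = sorted(
    (sid, cid, student_name[sid], course_name[cid]) for sid, cid in enrollments
)
```

Each student name and course name is now stored once, and `rejoined` equals the original `rows`.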

Third Normal Form (3NF):

• A table is in 3NF if:

  ▪ It is in 2NF.

  ▪ It has no transitive dependencies (i.e., non-prime attributes must depend only on candidate keys, not
on other non-prime attributes).

Example: Consider the following table (before 3NF):

Student_ID Student_Name Department Department_Head

1 Alice CS Dr. Smith

2 Bob Math Dr. Johnson

Here, Department_Head depends on Department, not directly on Student_ID, creating a transitive dependency.

After 3NF (remove transitive dependencies):

 Students Table:

Student_ID Student_Name Department

1 Alice CS

2 Bob Math

 Departments Table:

Department Department_Head

CS Dr. Smith

Math Dr. Johnson

Boyce-Codd Normal Form (BCNF):

 A table is in BCNF if:

 It is in 3NF.

 For every functional dependency, the determinant (the attribute on the left
side) must be a superkey.

Example: Consider the following table (before BCNF):

Student_ID Course_ID Instructor

1 101 Prof. A

1 102 Prof. B

2 101 Prof. A

Here, Instructor depends on Course_ID alone, and Course_ID is not a superkey of the table (the key is
(Student_ID, Course_ID)), so the table violates BCNF.

After BCNF:

 Enrollments Table:

Student_ID Course_ID

1 101

1 102

2 101

 Courses Table:

Course_ID Instructor

101 Prof. A

102 Prof. B
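
The BCNF condition can be checked mechanically with an attribute closure: a determinant is a superkey exactly when its closure contains every attribute of the table. The sketch below assumes functional dependencies are given as (left side, right side) tuples; `closure` and `bcnf_violations` are hypothetical names.

```python
def closure(attrs, fds):
    """Return all attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """List the non-trivial dependencies whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(all_attrs)
    ]


fds = [(("Course_ID",), ("Instructor",))]
before = bcnf_violations({"Student_ID", "Course_ID", "Instructor"}, fds)
after = bcnf_violations({"Course_ID", "Instructor"}, fds)   # the Courses table
```

In the original table, Course_ID determines only Course_ID and Instructor, so the dependency is reported in `before`; in the decomposed Courses table the same dependency is fine and `after` is empty.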

Summary of Normal Forms

Normal Form Requirement

1NF Eliminate repeating groups, store atomic values only.

2NF Eliminate partial dependencies (depends on the whole candidate key).

3NF Eliminate transitive dependencies (non-prime attributes depend only on the key).

BCNF Every determinant must be a superkey.

Benefits of Normalization

 Reduces Redundancy: Redundant data is minimized, reducing storage and making updates
easier.

 Prevents Anomalies: Insertion, update, and deletion anomalies are avoided.

 Improves Data Integrity: Ensures consistency and accuracy of the data.

Normalization helps design efficient and well-organized databases by breaking down large, complex tables into
smaller ones, maintaining relationships among them through foreign keys.

Q) Features of a good relation design?


Features of a Good Relation Design in DBMS

A good relation design in a database ensures efficiency, minimizes redundancy, and supports the maintenance
of data integrity. Here are the key features of a well-designed relational database schema:

1. Minimal Redundancy

 Definition: Avoids duplication of data across the database.

 Feature: Each piece of information is stored only once, reducing storage requirements and the risk of
inconsistencies.

 Example: Storing student details in a separate Students table rather than duplicating the same details in
every enrollment record.

2. Data Integrity

 Definition: Ensures that the data is accurate and consistent.

 Feature: Relational designs use constraints (such as primary keys, foreign keys, and unique constraints)
to maintain data integrity.

 Example: A foreign key constraint between a Course table and a Student table ensures that only valid
students can be enrolled in a course.
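
A foreign key constraint of this kind can be tried out with Python's built-in `sqlite3` module. This is a sketch under assumptions: the `Students` and `Enrollments` tables and the helper `try_enroll` are invented for illustration, and SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # foreign key checks are off by default in SQLite
conn.execute(
    "CREATE TABLE Students ("
    "Student_ID INTEGER PRIMARY KEY, "
    "Student_Name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE Enrollments ("
    "Student_ID INTEGER NOT NULL REFERENCES Students(Student_ID), "
    "Course_ID INTEGER NOT NULL, "
    "PRIMARY KEY (Student_ID, Course_ID))"
)
conn.execute("INSERT INTO Students VALUES (1, 'Alice')")


def try_enroll(student_id, course_id):
    """Return True if the enrollment is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Enrollments VALUES (?, ?)", (student_id, course_id))
        return True
    except sqlite3.IntegrityError:
        return False
```

`try_enroll(1, 101)` succeeds because student 1 exists, while `try_enroll(99, 101)` is rejected because no such student is stored.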

3. Avoidance of Anomalies

• Definition: Prevents issues like insertion, update, and deletion anomalies that can occur due to poor
design.

• Feature: Normalization to higher normal forms (like 3NF or BCNF) helps eliminate these anomalies.

• Example: An insertion anomaly occurs if a new course cannot be recorded until at least one student enrolls
in it, because course details are stored only in enrollment rows. This can be avoided by separating
course-related data into its own table.

4. Flexibility and Scalability

 Definition: The design should easily accommodate future growth or changes in requirements without
needing major redesigns.

 Feature: A good design uses normalization and modularity, ensuring that new entities or relationships
can be added without disrupting existing structures.

 Example: Adding new subjects or student records can be done without altering the existing course
structure, as subjects are in a separate table.

5. Efficient Data Retrieval

 Definition: The design should support quick and efficient queries, ensuring minimal computational
overhead.

 Feature: Proper indexing and use of primary and foreign keys help optimize query performance.

 Example: Creating an index on the Student_ID column helps retrieve student information faster.

6. Clear Relationship Representation

 Definition: Relationships between entities (tables) are clearly defined.

 Feature: The design should have clear foreign key relationships and should reflect the business rules or
real-world entities.

 Example: A Student_Course table clearly represents the many-to-many relationship between students
and courses.

7. Support for ACID Properties

• Definition: The relation design should ensure that transactions can adhere to ACID (Atomicity,
Consistency, Isolation, Durability) properties.

 Feature: Constraints and proper schema design ensure that operations on data remain consistent, even
in case of failures.

 Example: A Transaction table that keeps track of financial operations should be designed to guarantee
that each transaction is atomic and consistent.

8. Use of Appropriate Data Types

• Definition: Each column in the table should have an appropriate data type to minimize wasted space
and ensure data accuracy.

• Feature: Choosing the correct data type for each attribute ensures that data is stored efficiently and
validation rules are applied.

• Example: Using INT for age or VARCHAR for names ensures efficient storage.

9. Avoiding Complex Joins

• Definition: The design should minimize the need for complex joins, which can decrease query performance.

• Feature: The schema should be normalized to an extent that keeps joins simple and efficient.

• Example: Instead of storing all data in a single, overly large table, the data is broken into logically
connected smaller tables so that common queries need only a few straightforward joins on key columns.

10. Consistent Naming Conventions

 Definition: Naming conventions should be clear, consistent, and follow a logical pattern.

 Feature: Tables, columns, and constraints should have descriptive names that reflect the data they
store.

 Example: Naming a column Student_Name in the Students table is clearer than just Name, as it defines
the context.

Q) Explain concurrency control and its problems.


Concurrency Control in DBMS

Concurrency Control is a mechanism used in database management systems (DBMS) to manage simultaneous
transactions in a way that ensures the consistency and correctness of the database. Since multiple transactions
may execute concurrently, proper control is required to prevent conflicts and ensure that the database remains
in a consistent state.

The main goal of concurrency control is to achieve serializability, which means that the concurrent execution of
transactions should result in the same state as if the transactions were executed serially (one after another).

Concurrency Control Techniques

1. Lock-Based Protocols

• Definition: Transactions acquire locks on data items before accessing them. Locks prevent
other transactions from accessing the same data item in conflicting ways.

• Types of Locks:

  ▪ Shared Lock (S-lock): Allows a transaction to read a data item, but prevents other
transactions from modifying it.

  ▪ Exclusive Lock (X-lock): Allows a transaction to both read and write a data item, and
prevents others from accessing it.

• Two-Phase Locking (2PL): A protocol that requires each transaction to acquire all its locks before
it releases any. This guarantees serializability but can lead to deadlocks.

2. Timestamp-Based Protocols

 Definition: Every transaction is assigned a unique timestamp. The database system uses these
timestamps to determine the order of transactions and ensures that conflicting transactions are
executed in timestamp order.

3. Optimistic Concurrency Control (OCC)

 Definition: In OCC, a transaction is allowed to execute without locking resources, but before
committing, the system checks for conflicts with other transactions. If a conflict is detected, the
transaction is rolled back.

 Phases:

1. Read Phase: Transactions read the data and perform operations.

2. Validation Phase: The system checks if the transaction can commit without causing conflicts.

3. Write Phase: The transaction writes the changes to the database if no conflicts are found.

4. Multi-Version Concurrency Control (MVCC)

 Definition: MVCC allows multiple versions of data to exist. Each transaction reads a snapshot
of the data at the time it started, and transactions modify data without blocking others, creating
new versions of the data.

 Example: Systems like PostgreSQL use MVCC to allow concurrent reads and writes.

Concurrency Control Problems

When transactions execute concurrently, several problems can arise due to conflicting operations on the same
data. The main problems include:

1. Lost Update Problem

• Definition: This occurs when two or more transactions read the same data item and then update it, so that
a later write overwrites an earlier one and that earlier update is lost.

 Example: Transaction T1 reads a data item X, modifies it, and writes it back. Meanwhile, transaction T2
also reads the same X, modifies it, and writes it back. The update from T1 is lost, and only T2's changes
are retained.

2. Temporary Inconsistent State

• Definition: A transaction reads data that is in an intermediate or inconsistent state, which can lead to
incorrect results.

 Example: Transaction T1 reads a record of an account balance, performs some calculations, and writes
the result. Meanwhile, transaction T2 updates the account balance, leading T1 to perform calculations
based on outdated data.

3. Uncommitted Data (Dirty Read)

• Definition: A transaction reads data that has been modified by another transaction but not yet committed.
If the second transaction is rolled back, the first transaction will have read invalid or "dirty" data.

• Example: Transaction T1 reads an account balance, while T2 is updating that balance but hasn’t committed
yet. If T2 aborts, T1 will have used invalid data in its operations.

4. Non-Serializable Schedules

• Definition: A schedule of transactions (the order in which transactions execute) is non-serializable if it
does not result in the same final database state as any serial execution of those transactions.

 Example: Transaction T1 reads a data item, and T2 writes it. If the two transactions are interleaved in a
non-serializable order, the final result might not be the same as if T1 and T2 had executed one after the
other.

Deadlock in Concurrency Control

Deadlock occurs when two or more transactions are waiting for each other to release locks, causing them to be
stuck indefinitely.

Deadlock Prevention and Detection:

 Deadlock Prevention: Systems can prevent deadlock by using techniques like acquiring all locks before
a transaction starts executing, or by using a timeout approach that aborts a transaction that has waited too
long for a lock.

 Deadlock Detection: The system periodically checks for cycles in the wait-for graph (an edge from T1
to T2 means T1 is waiting for a lock held by T2) to identify deadlocks, and resolves them by aborting one
or more transactions in the cycle.
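
Cycle detection on a wait-for graph can be sketched with a depth-first search. The dictionary representation and the function name find_deadlock are assumptions made for illustration, not part of any particular DBMS.

```python
def find_deadlock(wait_for):
    """Return the transactions forming a cycle in the wait-for graph, or None.

    wait_for maps each transaction to the set of transactions it is waiting on.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current DFS path, finished
    color = {}
    path = []

    def visit(txn):
        color[txn] = GREY
        path.append(txn)
        for nxt in sorted(wait_for.get(txn, ())):
            state = color.get(nxt, WHITE)
            if state == GREY:                      # back edge: nxt is on the current path
                return path[path.index(nxt):]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[txn] = BLACK
        return None

    for txn in sorted(wait_for):
        if color.get(txn, WHITE) == WHITE:
            cycle = visit(txn)
            if cycle:
                return cycle
    return None


deadlocked = find_deadlock({"T1": {"T2"}, "T2": {"T1"}})
no_deadlock = find_deadlock({"T1": {"T2"}, "T2": set()})
```

A scheduler could pick a victim from the returned cycle and abort it, which removes its edges and lets the remaining transactions proceed.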

Q) Explain non-serializability in detail?


Non-Serializability in DBMS

Non-serializability refers to a situation where the execution of concurrent transactions in a database is not
equivalent to any serial execution of those transactions. A serial execution is one where the transactions are
executed one after the other without any interleaving. When transactions are interleaved (executed
concurrently), the order in which operations occur can affect the final state of the database. If the interleaving
leads to a result that cannot be reproduced by any serial execution, the schedule is considered non-serializable.

Non-serializability is a critical problem in database management systems, as it may lead to inconsistent or
incorrect results. Conversely, a schedule is said to be serializable if its effect is equivalent to executing the same
transactions serially (one by one) in some order.

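Types of Serializability

Whether a schedule is serializable is usually tested against two notions of serializability:

1. Conflict Serializability

2. View Serializability

A schedule is non-serializable if it is not view-serializable. Because every conflict-serializable schedule is also
view-serializable, such a schedule is not conflict-serializable either.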

1. Conflict Serializable Schedules

A conflict serializable schedule is one that can be transformed into a serial schedule by swapping adjacent
non-conflicting operations, that is, without changing the relative order of any pair of conflicting operations.
Conflicting operations are operations that access the same data item where at least one of them is a write.

 Conflicting Operations: Two operations conflict if they meet all the following conditions:

1. They belong to different transactions.

2. They operate on the same data item.

3. At least one of them is a write operation.

Example of conflicting operations:

 T1: Write(A) and T2: Read(A)

 T1: Write(A) and T2: Write(A)

If the conflicting operations are ordered so that T1 must come before T2 for one pair and T2 must come before T1
for another, the schedule is not conflict-serializable.
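
The three conditions and the resulting test can be sketched in code. This sketch uses the standard precedence-graph method: add an edge Ti → Tj whenever an operation of Ti conflicts with a later operation of Tj, and check the graph for cycles. The tuple encoding (transaction, "R" or "W", item) and the function names are assumptions made for illustration.

```python
from itertools import combinations


def conflicts(op1, op2):
    """Two operations conflict: different transactions, same item, at least one write."""
    txn1, action1, item1 = op1
    txn2, action2, item2 = op2
    return txn1 != txn2 and item1 == item2 and "W" in (action1, action2)


def precedence_edges(schedule):
    """Edge (Ti, Tj) for every conflicting pair where Ti's operation comes first."""
    return {(first[0], second[0])
            for first, second in combinations(schedule, 2)
            if conflicts(first, second)}


def is_conflict_serializable(schedule):
    """True if the precedence graph is acyclic (checked by repeatedly removing sources)."""
    edges = precedence_edges(schedule)
    indegree = {op[0]: 0 for op in schedule}
    for _, dst in edges:
        indegree[dst] += 1
    ready = [txn for txn, deg in indegree.items() if deg == 0]
    removed = 0
    while ready:
        txn = ready.pop()
        removed += 1
        for src, dst in edges:
            if src == txn:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    ready.append(dst)
    return removed == len(indegree)


serial = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
interleaved = [("T1", "R", "A"), ("T2", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A")]
```

For the interleaved schedule the graph contains both T1 → T2 and T2 → T1, so no serial order can respect every conflict.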

2. View Serializable Schedules



A view serializable schedule is one that is view-equivalent to some serial schedule: every transaction reads each
data item from the same source (the initial value or the same writing transaction) in both schedules, and the
same transaction performs the final write on each data item.

View serializability is a broader concept than conflict serializability: every conflict-serializable schedule is
view-serializable, but some schedules with blind writes (writes not preceded by a read of the same item) are
view-serializable without being conflict-serializable.
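
View equivalence can be checked by brute force for small schedules. The sketch below assumes the same (transaction, "R" or "W", item) encoding and tries every serial order, comparing which writer each read sees and which transaction writes each item last; the function names are hypothetical, and the approach is only practical for a handful of transactions.

```python
from itertools import permutations


def view_profile(schedule):
    """Return (reads_from, final_writers) for a schedule of (txn, action, item) steps."""
    last_writer = {}     # item -> transaction that most recently wrote it
    reads_from = []      # (reader, item, writer), writer is None for the initial value
    for txn, action, item in schedule:
        if action == "R":
            reads_from.append((txn, item, last_writer.get(item)))
        else:
            last_writer[item] = txn
    return sorted(reads_from, key=repr), dict(last_writer)


def is_view_serializable(schedule):
    """True if some serial order of the transactions has the same view profile."""
    txns = sorted({op[0] for op in schedule})
    target = view_profile(schedule)
    for order in permutations(txns):
        serial = [op for txn in order for op in schedule if op[0] == txn]
        if view_profile(serial) == target:
            return True
    return False


# Blind writes by T2 and T3: view-equivalent to the serial order T1, T2, T3.
blind_writes = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A"), ("T3", "W", "A")]
# Both transactions read the initial value, then both write.
both_read_first = [("T1", "R", "A"), ("T2", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A")]
```

The first schedule is not conflict-serializable (T1's read precedes T2's write, and T2's write precedes T1's write), yet it is view-serializable because T3's final write hides the order of the earlier writes.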

Schedule Example: A Serial Baseline

Let's consider an example where we have two transactions T1 and T2:

 Transaction T1:

1. Read(A)

2. Write(A)

 Transaction T2:

1. Read(A)

2. Write(A)

Schedule:

Step   Operation   Transaction
1      Read(A)     T1
2      Write(A)    T1
3      Read(A)     T2
4      Write(A)    T2

In this schedule:

 T1 reads and writes A, followed by T2 reading and writing A.

 No operation of T2 runs between T1's operations, so the schedule is not interleaved at all. It is exactly the
serial schedule T1 → T2 and is therefore trivially serializable.

The two possible serial orders of these transactions are:

 Serial Schedule 1: T1 → T2 (T1 performs all its operations before T2).

 Serial Schedule 2: T2 → T1 (T2 performs all its operations before T1).

The two serial schedules may leave A with different values, and that is acceptable: serializability only requires a
schedule to be equivalent to some serial order, not to all of them. Non-serializability arises only when operations
are interleaved so that no serial order matches, for example when Read(A) of T2 is moved ahead of Write(A) of T1.

Problems Caused by Non-Serializable Schedules

1. Inconsistent Data: Non-serializable schedules can lead to the database being in an inconsistent state.
For instance, one transaction may read an intermediate state of data that is not yet committed by another
transaction, leading to incorrect results.

2. Lost Updates: A non-serializable schedule may cause updates from one transaction to overwrite the
updates of another transaction, leading to lost data.

3. Anomalies in Concurrent Transactions: Operations in a non-serializable schedule may lead to anomalies
like dirty reads, non-repeatable reads, and phantom reads, which can break the integrity of the data.

Example of Non-Serializable Schedule with Anomalies

Let’s look at a schedule with transactions T1 and T2.

 Transaction T1:

 Read(A)

 Write(A) (set A = 50)

 Transaction T2:

 Read(A)

 Write(A) (set A = 100)

Interleaved Schedule:

Step   Operation      Transaction
1      Read(A)        T1
2      Read(A)        T2
3      Write(A=50)    T1
4      Write(A=100)   T2

 In this case, both transactions read the value of A before either transaction writes back its value.

 T1 reads A as some value (let's assume A = 20), updates A to 50, and commits.

 T2 also reads A as 20, but updates A to 100 and commits.

 The final value of A is 100, and T1's write of 50 has been overwritten without T2 ever seeing it. In any
serial execution, the transaction that runs second would read the value written by the first, not 20.

Possible Serialized Schedules:

 Serial Schedule 1: T1 → T2

 T1 reads A = 20, sets A = 50.

 T2 reads A = 50 (the updated value), sets A = 100.

 Final A = 100.

 Serial Schedule 2: T2 → T1

 T2 reads A = 20, sets A = 100.

 T1 reads A = 100, sets A = 50.

 Final A = 50.

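In this case, the interleaved schedule matches neither serial schedule. Its final value of 100 happens to coincide
with T1 → T2 only because both writes store fixed values; the reads differ, since T2 read 20 where T1 → T2
would have it read 50. If each write depended on the value read (for example, A = A + 30), the final value would
differ as well, demonstrating non-serializability.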
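
The example can be replayed in a few lines to compare what each transaction reads and the final value of A. This is a sketch that uses the values from the example (initial A = 20, T1 writes 50, T2 writes 100); the execute helper is hypothetical.

```python
def execute(schedule, initial=20):
    """Run (txn, action, value) steps against A; return each read value and the final A."""
    a = initial
    reads = {}
    for txn, action, value in schedule:
        if action == "R":
            reads[txn] = a       # record what this transaction saw
        else:
            a = value            # the write stores a fixed value
    return reads, a


T1 = [("T1", "R", None), ("T1", "W", 50)]
T2 = [("T2", "R", None), ("T2", "W", 100)]

interleaved = execute([T1[0], T2[0], T1[1], T2[1]])
serial_t1_t2 = execute(T1 + T2)
serial_t2_t1 = execute(T2 + T1)
```

The interleaved run ends with A = 100, the same final value as T1 → T2, but T2 read 20 instead of 50. It is the reads, not the final value, that show no serial order reproduces this schedule.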

Conclusion

Non-serializability is a significant issue in concurrent transaction processing. It occurs when the interleaving of
transaction operations leads to results that are not equivalent to any serial execution. To avoid
non-serializability, a DBMS uses concurrency control techniques, such as locking protocols, timestamp-based
protocols, and optimistic concurrency control, that ensure serializability or manage conflicts in a way that
maintains data integrity. Choosing an appropriate isolation level is critical to ensuring that transactions do not
interfere with each other in ways that cause non-serializability.
