DBMS Notes by Tarun
DBMS continues to evolve with AI, cloud computing, and real-time analytics!
5. Key FAQs
• First DBMS: IBM IMS (1960s), used in NASA’s Apollo program.
• Relational Model: Introduced by Edgar F. Codd (1970).
• Pre-relational models: Hierarchical & Network models.
• SQL Standardization: Standardized by ANSI in the 1980s, making SQL the industry standard.
Applications of DBMS
Reservations (Railway/Airline) – Manages bookings, schedules, and transactions.
Library – Tracks books, issues, and returns.
Banking & Finance – Handles accounts, transactions, and security.
Education & HR – Manages student records, payroll, and recruitment.
E-Commerce & Credit Cards – Tracks orders, payments, and recommendations.
Social Media & Web – Stores user data, messages, and activity logs.
Telecom – Manages calls, billing, and subscriptions.
Healthcare – Stores patient records, prescriptions, and billing.
Security – Ensures encryption and access control.
Manufacturing – Tracks inventory, production, and supply chain.
Key Takeaways
• 1-Tier: Best for standalone applications.
• 2-Tier: Good for small businesses or internal applications.
• 3-Tier: Ideal for large-scale, multi-user applications with better security and performance.
Backup & Recovery: File System – no built-in mechanism; DBMS – provides automated backup and recovery.
Query Processing: File System – no efficient query system; DBMS – uses SQL for complex queries.
Data Sharing: File System – hard to share data across systems; DBMS – easy data sharing due to centralization.
Key Takeaways
Use File System when you need simple file storage (documents, images, media).
Use DBMS for handling structured data, complex queries, and multiple user access.
DBMS ensures data integrity, security, and scalability, unlike the file system.
Components of ER Diagram
1. Entities: Objects with physical or conceptual existence (e.g., person, company).
2. Attributes: Properties that define an entity (e.g., name, age).
3. Relationships: Associations between entities (e.g., student enrolled in a course).
What is an Entity?
1. Definition: An object with physical (e.g., person) or conceptual (e.g., company) existence.
2. Entity Set: A collection of entities of the same type (e.g., all students).
o Represented in ER diagrams, but individual entities (rows) are not.
Types of Entities
1. Strong Entity:
o Has a key attribute (primary key) for unique identification.
o Does not depend on other entities.
o Represented by a rectangle.
2. Weak Entity:
o Lacks a key attribute and depends on a strong entity for identification.
o Represented by a double rectangle.
o Example: Dependents of an employee.
Types of Attributes
1. Key Attribute:
o Uniquely identifies an entity (e.g., Roll_No).
o Represented by an oval with the attribute name underlined.
2. Composite Attribute:
o Made up of multiple attributes (e.g., Address = Street + City).
o Represented by an oval containing smaller ovals.
3. Multivalued Attribute:
o Can have multiple values (e.g., Phone_No).
o Represented by a double oval.
4. Derived Attribute:
o Can be derived from other attributes (e.g., Age from DOB).
o Represented by a dashed oval.
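A minimal SQL sketch of how these attribute types are commonly mapped to tables, using the attribute names from the notes (Roll_No, Address = Street + City, DOB/Age, Phone_No); the table layout itself is an assumption for illustration:

-- Composite attribute Address stored as its parts; derived attribute Age not stored (computed from DOB)
CREATE TABLE Student (
  Roll_No INT PRIMARY KEY,      -- key attribute
  Name    VARCHAR(50),
  Street  VARCHAR(50),          -- part of composite attribute Address
  City    VARCHAR(50),          -- part of composite attribute Address
  DOB     DATE                  -- Age is derived from DOB, so it is not a stored column
);

-- Multivalued attribute Phone_No kept in a separate table: one row per phone number
CREATE TABLE Student_Phone (
  Roll_No  INT,
  Phone_No VARCHAR(15),
  PRIMARY KEY (Roll_No, Phone_No),
  FOREIGN KEY (Roll_No) REFERENCES Student(Roll_No)
);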
Cardinality in Relationships
1. One-to-One (1:1):
o Each entity in one set relates to only one entity in another set.
o Example: One person marries one person.
2. One-to-Many (1:M):
o One entity relates to multiple entities in another set.
o Example: One department has many doctors.
3. Many-to-One (M:1):
o Many entities relate to one entity in another set.
o Example: Many students enroll in one course.
4. Many-to-Many (M:N):
o Entities in both sets relate to multiple entities.
o Example: Students enroll in multiple courses, and courses have multiple students.
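A short SQL sketch of how these cardinalities are typically realized in tables, reusing the department/doctor and student/course examples above (table and column names are illustrative assumptions): a foreign key models 1:M, and a junction table models M:N.

-- 1:M – many doctors reference one department
CREATE TABLE Department (
  Dept_ID   INT PRIMARY KEY,
  Dept_Name VARCHAR(50)
);

CREATE TABLE Doctor (
  Doctor_ID INT PRIMARY KEY,
  Doctor_Name VARCHAR(50),
  Dept_ID   INT,
  FOREIGN KEY (Dept_ID) REFERENCES Department(Dept_ID)
);

-- M:N – each (student, course) pair is one row of the junction table
CREATE TABLE Student (Student_ID INT PRIMARY KEY, Name  VARCHAR(50));
CREATE TABLE Course  (Course_ID  INT PRIMARY KEY, Title VARCHAR(50));

CREATE TABLE Enrollment (
  Student_ID INT,
  Course_ID  INT,
  PRIMARY KEY (Student_ID, Course_ID),
  FOREIGN KEY (Student_ID) REFERENCES Student(Student_ID),
  FOREIGN KEY (Course_ID)  REFERENCES Course(Course_ID)
);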
Participation Constraints
1. Total Participation:
o Every entity in the set must participate in the relationship.
o Represented by a double line in ER diagrams.
2. Partial Participation:
o Entities may or may not participate in the relationship.
o Example: Some courses may not have any students enrolled.
Conclusion
• The ER Model is a powerful tool for designing databases.
• It visually represents entities, attributes, and relationships, making it easier to understand and
organize data.
Conclusion
Structural constraints in ER modeling play a crucial role in defining relationships between entities and
ensuring database integrity. By combining cardinality constraints and participation constraints, designers
can create well-structured and efficient databases. These constraints help prevent data inconsistencies and
optimize schema design, leading to improved query performance and schema evolution.
2. Entity Type
• A category or class of similar entities that share the same attributes.
• Defines what attributes an entity will have in a database.
• Represented as a table schema in a relational database.
• Example: The "Student" entity type includes attributes like Student_ID, Name, Age.
3. Entity Set
• The collection of all entities of a particular entity type at a given time.
• Represents the data stored in the table at a specific moment.
• Can grow or shrink as entities are added or removed.
• Example: All students currently enrolled in a university form the Student entity set.
Comparison Table
Definition: Entity – a real-world object; Entity Type – a category of similar entities; Entity Set – a collection of all entities of a type.
Change Over Time: Entity – fixed identity; Entity Type – fixed structure; Entity Set – can grow or shrink.
In short:
✔ Entity = Single record (row in a table)
✔ Entity Type = Table schema (structure of records)
✔ Entity Set = Collection of records (all rows in a table)
• Prevents invalid data entry (e.g., negative salaries or ages below 18).
Constraints in DBMS
1. Primary Key Constraint: Ensures that each record in a table is unique (e.g., "StudentID" for a
"Students" table).
2. Foreign Key Constraint: Links a table to another by referencing its primary key, maintaining
referential integrity (e.g., linking "OrderID" in an "OrderDetails" table to "Orders" table).
3. Unique Constraint: Ensures all values in a column are different (e.g., "Email" column in a "Users"
table).
4. Not Null Constraint: Ensures that a column cannot have null values (e.g., "LastName" column in an
"Employees" table).
5. Check Constraint: Ensures that values meet a specific condition (e.g., "Age" must be over 18).
6. Default Constraint: Provides a default value for a column if no value is specified (e.g., default
"Pending" status in an "Orders" table).
Derived Operators
1. Natural Join (⋈): Combines tables based on common attributes with matching values.
o Example: EMP ⋈ DEPT where Dept_Name is common between EMP and DEPT.
o Purpose: Combines related data using shared attributes.
2. Conditional Join: Similar to Natural Join, but allows for custom conditions such as >=, <, or ≠.
o Example: Join R and S where R.Marks >= S.Marks.
o Purpose: More flexible join conditions.
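The SQL counterparts of these two joins, using the EMP, DEPT, R, and S relations mentioned above (column names other than Dept_Name and Marks are assumptions; NATURAL JOIN is supported by most, but not all, SQL dialects):

-- Natural join: matches EMP and DEPT on their common attribute (Dept_Name), removing the duplicate column
SELECT *
FROM EMP
NATURAL JOIN DEPT;

-- Conditional (theta) join: any comparison operator can appear in the join condition
SELECT *
FROM R
JOIN S
  ON R.Marks >= S.Marks;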
Relational Calculus
While Relational Algebra is procedural, Relational Calculus is non-procedural. It describes what data is
required but not how to obtain it. There are two types:
• Tuple Relational Calculus (TRC)
• Domain Relational Calculus (DRC)
Conclusion
Relational Algebra may seem theoretical, but it plays a critical role in database query design and
optimization. By understanding its operators, you can break down complex queries into simpler ones and
efficiently retrieve and manipulate data in relational databases. Whether you're working with selection,
projection, or joins, these operators are essential tools for anyone managing a relational database.
FAQs
• What is the difference between Relational Algebra and SQL?
o Relational Algebra is procedural (it tells how to retrieve data), whereas SQL is declarative (it
tells what data to retrieve).
• Why is Relational Algebra important?
o It forms the theoretical foundation for relational databases and query optimization.
• Can Relational Algebra handle complex queries?
o Yes, by combining basic operators, complex queries can be constructed.
• What is meant by “union-compatible” relations?
o Relations are union-compatible if they have the same number of attributes with matching
data types.
• How does the Join operator work in Relational Algebra?
o It combines rows based on a specified condition, usually involving a common attribute.
1. Intersection (∩)
• Returns common tuples between two relations.
• Relations must be union-compatible (same attributes and domains).
2. Conditional Join (⋈ₓ)
• Joins two relations based on any condition, not just equality.
• Uses selection and cross-product.
3. Equijoin (⋈)
• A specific Conditional Join based on equality.
4. Natural Join (⋈)
• Automatically joins two relations on attributes with the same name.
• Duplicate columns are removed.
5. Left Outer Join (⟕)
• Returns all tuples from the left relation, even if no match exists.
• Unmatched tuples in the right relation get NULL values.
6. Right Outer Join (⟖)
• Returns all tuples from the right relation, even if no match exists.
• Unmatched tuples in the left relation get NULL values.
7. Full Outer Join (⟗)
• Returns all tuples from both relations.
• If no match is found, missing attributes are filled with NULL.
8. Division (÷)
• Finds tuples in A that are related to all tuples in B.
• Used for "For All" queries.
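Division has no direct SQL keyword; a common way to express a "for all" query is a double NOT EXISTS. A sketch under assumed relations Enrolled(Student_ID, Course_ID) as A and Courses(Course_ID) as B, finding students enrolled in all courses:

SELECT DISTINCT e.Student_ID
FROM Enrolled e
WHERE NOT EXISTS (
    SELECT c.Course_ID
    FROM Courses c
    WHERE NOT EXISTS (
        SELECT 1
        FROM Enrolled e2
        WHERE e2.Student_ID = e.Student_ID
          AND e2.Course_ID  = c.Course_ID
    )
);
-- Reads as: there is no course that this student is NOT enrolled in.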
1. Join Operation
A join operation is used to combine data from two or more tables based on a common column or
condition. The result of a join is a single result set containing columns from all the involved tables.
• Types of Joins:
o INNER JOIN: Returns rows where there is a match in both tables.
o LEFT JOIN (OUTER JOIN): Returns all rows from the left table, and matching rows from the
right table. If no match, returns NULL for columns from the right table.
o RIGHT JOIN (OUTER JOIN): Similar to LEFT JOIN but returns all rows from the right table.
o NATURAL JOIN: Automatically joins tables based on columns with the same name and
compatible data types.
Example of an Inner Join:
Let's say we have two tables:
Table1 (ID, Name):
ID Name
1 John
2 Sarah
3 David
Table2 (ID, Address):
ID Address
SQL Query:
SELECT Table1.ID, Table1.Name, Table2.Address
FROM Table1
INNER JOIN Table2
ON Table1.ID = Table2.ID;
Result:
ID Name Address
Explanation: The query combines rows from both tables based on the common column ID.
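For comparison, a LEFT JOIN version of the same query (a sketch; Table2's rows are not listed above, so only the behaviour is described): every row of Table1 appears in the result, and Address is NULL for IDs with no match in Table2.

SELECT Table1.ID, Table1.Name, Table2.Address
FROM Table1
LEFT JOIN Table2
  ON Table1.ID = Table2.ID;   -- unmatched Table1 rows keep NULL in the Address column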
2. Subquery (Nested Query)
A subquery is a query written inside another query; the result of the inner query is used by the outer query. Using the same tables:
Table1 (ID, Name):
ID Name
1 John
2 Sarah
3 David
Table2 (ID, Address):
ID Address
SQL Query:
SELECT Name
FROM Table1
WHERE ID IN (SELECT ID FROM Table2);
Result:
Name
John
Sarah
Explanation: The subquery (SELECT ID FROM Table2) retrieves IDs from Table2, and the outer query selects
the names of the people in Table1 whose IDs match those returned by the subquery.
Join vs. Subquery (Nested Query)
Definition: Join – combines rows from multiple tables based on a condition; Subquery – a query within a query, where the results of the inner query are used by the outer query.
Performance: Join – often more efficient, especially for large datasets; Subquery – may be slower for large datasets, as the subquery is executed for each row.
Flexibility: Join – less flexible in cases requiring multiple conditions or nested logic; Subquery – more flexible for advanced conditions or operations on subsets of data.
Distributed Databases: Join – may require fetching entire tables from multiple locations, leading to higher overhead; Subquery – can reduce the data transferred, as only relevant data is fetched from each location.
In the STUDENT relation:
• STUD_NO → STUD_NAME, STUD_NO → STUD_PHONE, STUD_NO → STUD_STATE, etc., hold true
because STUD_NO uniquely identifies these attributes.
5. How to Find Candidate Keys and Super Keys Using Attribute Closure
• Super Key: A set of attributes that can uniquely identify a tuple in a relation. If the closure of an
attribute set contains all attributes in the relation, the set is a super key.
• Candidate Key: A minimal super key (i.e., no proper subset of the attribute set can uniquely identify
all attributes).
Steps to Find Candidate Keys:
1. Find the closure of different attribute sets.
2. The attribute set whose closure contains all the attributes of the relation is a super key.
3. If no subset of the set can be a super key, it is a candidate key.
Example: For STUD_NO+ = {STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_COUNTRY,
STUD_AGE}, STUD_NO is a candidate key.
Prime Attribute: Attributes that are part of any candidate key.
• In the STUDENT relation, STUD_NO is a prime attribute because it is part of the candidate key.
Non-Prime Attribute: Attributes that are not part of any candidate key.
• In the STUDENT relation, STUD_NAME, STUD_PHONE, etc., are non-prime attributes.
Conclusion
Functional Dependency and Attribute Closure are fundamental concepts in database design, helping
ensure data consistency, integrity, and efficiency. By understanding these tools, one can:
• Identify candidate keys and super keys.
• Determine the closure of attribute sets.
• Optimize database queries and designs for better performance.
➤ Prime Attribute
An attribute that is part of a candidate key (i.e., it contributes to uniquely identifying records).
Example: In a STUDENT table with (Roll_No, Name, Age), if Roll_No is the candidate key, then Roll_No is
a prime attribute.
➤ Non-Prime Attribute
An attribute that is not part of any candidate key.
Example: If Roll_No is the candidate key, then Name and Age are non-prime attributes.
Example Table
Roll_No Name Age
101 Ram 18
102 Sita 19
➤ Notation
X→Y
This means:
• If two tuples have the same value for X, they must have the same value for Y.
• X is called the determinant, and Y is the dependent attribute.
Example:
• Consider a STUDENT table:
Roll_No Name Age Dept
101 Ram 18 CS
102 Sita 19 IT
103 Amit 18 CS
X → Y is trivial if Y is a subset of X.
Example: {Roll_No, Name} → Name
• Since Name is already part of {Roll_No, Name}, it is trivial.
General Rule
• A functional dependency X → Y is trivial if X already contains Y.
Non-Trivial Functional Dependency
General Rule
• A functional dependency X → Y is non-trivial if X does not contain Y.
Fully Functional Dependency
General Rule
• If removing any attribute from X breaks the dependency X → Y, then Y is fully functionally dependent on X.
Partial Dependency: Why is it a problem?
• Leads to redundancy (same name repeated multiple times).
• Violates 2NF (Second Normal Form).
Transitive Dependency: Why is it a problem?
• Leads to redundancy.
• Violates 3NF (Third Normal Form).
Multivalued Dependency: Why is it a problem?
• Leads to data redundancy.
• Violates 4NF (Fourth Normal Form).
The closure of a set of functional dependencies (F⁺) is the complete set of dependencies that can be logically inferred from it.
Steps to Find the Closure of an Attribute Set (X⁺)
1. Start with a given set of attributes X.
2. Apply all functional dependencies iteratively.
3. Find all attributes that can be derived from X.
Example:
Given:
• A→B
• B→C
Find A⁺ (Closure of A):
• A → B (Given)
• B → C (Since B is known, we get C)
• So, A⁺ = {A, B, C}
Armstrong's Axioms (Inference Rules)
1. Reflexivity: If Y ⊆ X, then X → Y
• Example: {Roll_No, Name} → Name
2. Augmentation: If X → Y, then XZ → YZ
• Example: If Roll_No → Name, then {Roll_No, Age} → {Name, Age}
3. Transitivity: If X → Y and Y → Z, then X → Z
• Example: If A → B and B → C, then A → C (as in the closure example above)
6. Key Takeaways
Functional dependencies are critical for designing an efficient and well-structured database.
This violates 1NF because the "Writer" attribute is multi-valued. To make it compliant with 1NF, the table
should be restructured so that each writer gets its own row:
Book ID Writer
101 John
101 Sarah
101 Jack
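A possible SQL sketch of the 1NF-compliant structure above (names are illustrative): the multi-valued Writer attribute becomes one row per (book, writer) pair.

CREATE TABLE Book_Writer (
  Book_ID INT,
  Writer  VARCHAR(50),
  PRIMARY KEY (Book_ID, Writer)   -- each (book, writer) pair is stored exactly once
);

INSERT INTO Book_Writer (Book_ID, Writer) VALUES
  (101, 'John'),
  (101, 'Sarah'),
  (101, 'Jack');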
Conclusion:
1NF lays the foundation for a well-structured database by enforcing that each column has atomic
(indivisible) values, reducing redundancy and making the data easier to manage.
FAQs on First Normal Form (1NF)
• What does 1NF mean? 1NF ensures that a database table contains only atomic (indivisible) values
and that all columns have unique names, facilitating easy and consistent data management.
• What is the significance of 1NF in database design? Implementing 1NF is crucial because it
removes redundant data and ensures that tables are structured in a way that supports data
integrity, efficient queries, and operations.
• What is the first normal form (1NF)? 1NF guarantees that there are no repeating groups within
rows and that all columns contain atomic, indivisible values, ensuring a basic level of data
consistency and organization in the database.
This explanation of First Normal Form (1NF) emphasizes its role in organizing and structuring data
efficiently to eliminate redundancy and maintain consistency. It also highlights the importance of following
simple rules to make the database more manageable and query-friendly.
101 C1 1000
101 C2 2000
102 C1 1000
Why BCNF?
Even though a relation might be in 3NF, there could still be cases where a functional dependency exists
where the left-hand side isn't a superkey. This can lead to redundancy and anomalies. BCNF eliminates this
problem by ensuring all dependencies are properly associated with superkeys.
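A small illustrative decomposition into BCNF (this table and its dependencies are a hypothetical example, not taken from the notes): in Enrollment(Student_ID, Subject, Professor), if Professor → Subject but Professor is not a superkey, the relation is split so that every determinant becomes a key.

-- Original relation: Enrollment(Student_ID, Subject, Professor)
-- FD Professor -> Subject holds, yet Professor is not a superkey, so it is not in BCNF.
-- Decomposition:
CREATE TABLE Professor_Subject (
  Professor VARCHAR(50) PRIMARY KEY,   -- the determinant is now a key
  Subject   VARCHAR(50)
);

CREATE TABLE Student_Professor (
  Student_ID INT,
  Professor  VARCHAR(50),
  PRIMARY KEY (Student_ID, Professor),
  FOREIGN KEY (Professor) REFERENCES Professor_Subject(Professor)
);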
Key Concepts:
• Superkey: A set of one or more attributes that can uniquely identify a record in a relation.
• Candidate Key: A minimal superkey (i.e., no subset of the key can uniquely identify records).
• Non-prime Attribute: Attributes that are not part of any candidate key.
Advantages of BCNF:
1. Reduces Redundancy: Eliminates data duplication by ensuring that the determinant of every functional dependency is a superkey.
2. Improves Data Integrity: By removing unnecessary dependencies, BCNF enhances the consistency
and integrity of the database.
3. Prevents Update Anomalies: BCNF ensures there are no situations where updates might result in
inconsistencies, like having multiple entries for the same data.
Conclusion
BCNF is a more robust normalization technique compared to 3NF, addressing potential redundancy issues
caused by non-superkey determinants. While it may not always be feasible to apply BCNF without losing
dependency preservation, it remains a powerful tool for ensuring a consistent, efficient, and less redundant
database design.
Why 3NF?
The primary goal of 3NF is to eliminate transitive dependencies. These are cases where one non-prime
attribute depends on another non-prime attribute, potentially leading to data redundancy and
inconsistency. By removing these dependencies, 3NF ensures a more efficient, consistent, and reliable
database design.
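An illustrative removal of a transitive dependency (a hypothetical example, not from the notes): in Student(Roll_No, Zip, City), Roll_No → Zip and Zip → City make City transitively dependent on the key, so the Zip-to-City mapping is moved into its own table.

-- Before: Student(Roll_No, Zip, City) with Roll_No -> Zip and Zip -> City
CREATE TABLE Zip_City (
  Zip  VARCHAR(10) PRIMARY KEY,   -- Zip now determines City in exactly one place
  City VARCHAR(50)
);

CREATE TABLE Student (
  Roll_No INT PRIMARY KEY,
  Zip     VARCHAR(10),
  FOREIGN KEY (Zip) REFERENCES Zip_City(Zip)
);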
Key Concepts:
• Superkey: A set of one or more attributes that uniquely identifies a record in a relation.
• Candidate Key: A minimal superkey, meaning no proper subset of it can uniquely identify records.
• Prime Attribute: An attribute that is part of any candidate key.
• Non-prime Attribute: An attribute that is not part of any candidate key.
Advantages of 3NF:
1. Reduces Redundancy: By eliminating transitive dependencies, 3NF helps in removing unnecessary
data duplication.
2. Improves Data Integrity: 3NF ensures that the data in the database remains consistent and free
from anomalies.
3. Flexibility in Queries: With fewer dependencies, queries become simpler and more efficient.
4. Minimizes Update Anomalies: Eliminating transitive dependencies reduces the chances of
inconsistent updates.
Conclusion
Third Normal Form (3NF) is an important normalization step that builds on 1NF and 2NF by eliminating
transitive dependencies. By ensuring that non-prime attributes only depend on superkeys or are part of
candidate keys, 3NF helps in reducing redundancy, improving data integrity, and maintaining a consistent
and efficient database design. However, for even stricter designs, higher normal forms like BCNF may be
considered.
Example of Redundancy
Consider a table storing details about students, including attributes like student ID, name, college name,
course, and rank. The following table shows how data redundancy can appear:
• Notice that College, Course, and Rank attributes are repeated for every student, which creates
unnecessary redundancy.
Problems Caused by Redundancy
1. Insertion Anomaly
• Definition: This occurs when inserting a new record into the database requires unnecessary
additional data to be inserted.
• Example: If a student’s course is undecided, the student's record cannot be inserted without
assigning a course. This would prevent the insertion of incomplete records.
2. Deletion Anomaly
• Definition: Deleting a record can unintentionally remove useful data.
• Example: If a student’s record is deleted, the college information might also be deleted, which
should not happen as the college's data is independent of the student's record.
3. Updation Anomaly
• Definition: Updates to data may need to be applied multiple times across the database, leading to
inconsistencies.
• Example: If the college rank changes, every record for students at that college must be updated.
Failing to do so would leave some records with outdated rank data.
Conclusion
Redundancy in databases is a common issue that can cause data inconsistencies, higher storage
requirements, performance degradation, and security risks. The best way to handle redundancy is through
normalization, which organizes data to eliminate unnecessary duplication and improve database efficiency
and integrity.
Dependency Preservation: All functional dependencies of the original relation are enforceable on the decomposed subrelations without needing to join them; the set of FDs in the subrelations should cover all FDs of the original relation.
Conclusion
• Lossless Join ensures that no data is lost or extraneous data added during decomposition. It is
essential for maintaining the correctness of the database.
• Dependency Preserving Decomposition allows all integrity constraints to be enforced locally within
each subrelation, making the database easier to maintain and efficient in operation.
Both properties are crucial in the normalization process to achieve a database design that is both efficient
and maintains data integrity.
Denormalization in Databases
Denormalization is an optimization technique used in databases to improve query performance, primarily
by reducing the number of joins required during querying. However, this technique comes at the cost of
adding redundancy to the database, which can lead to increased maintenance complexity.
What is Denormalization in Databases?
Denormalization involves adding redundant data to one or more tables after a database has been
normalized. This is done to avoid the performance overhead of joining multiple normalized tables during
query execution. It is not a reversal of normalization but an optimization technique applied to make
databases more efficient for read-heavy applications.
In a normalized database, the goal is to minimize redundancy by splitting data into smaller, related tables.
For example, in a normalized schema, a Courses table and a Teachers table might store the teacher's name
and ID separately. To get a list of courses with teacher names, a join would be required.
In a denormalized schema, some of this information may be combined into one table, such as having both
the course and teacher's information in the same table to speed up query execution, albeit at the cost of
redundancy.
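A sketch of the Courses/Teachers example in SQL (column names are assumptions): the normalized design needs a join to list courses with teacher names, while the denormalized table repeats the teacher's name in every course row.

-- Normalized: teacher data stored once, a join is needed
CREATE TABLE Teachers (
  Teacher_ID   INT PRIMARY KEY,
  Teacher_Name VARCHAR(50)
);

CREATE TABLE Courses (
  Course_ID   INT PRIMARY KEY,
  Course_Name VARCHAR(50),
  Teacher_ID  INT,
  FOREIGN KEY (Teacher_ID) REFERENCES Teachers(Teacher_ID)
);

SELECT c.Course_Name, t.Teacher_Name
FROM Courses c
JOIN Teachers t ON c.Teacher_ID = t.Teacher_ID;

-- Denormalized: teacher name duplicated into the course table, no join needed for reads
CREATE TABLE Courses_Denorm (
  Course_ID    INT PRIMARY KEY,
  Course_Name  VARCHAR(50),
  Teacher_ID   INT,
  Teacher_Name VARCHAR(50)   -- redundant copy, kept for faster reads
);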
Steps to Denormalization
1. Unnormalized Table:
In this stage, all data is stored in one large table with significant redundancy. For example, student
names and class information could appear multiple times.
2. Normalized Structure:
In a normalized schema, this data is split into smaller, related tables to minimize redundancy and
avoid update anomalies. Each table now represents a specific aspect, such as students or classes.
3. Denormalized Table:
To improve query performance, we can combine the related tables into a single table. This removes
the need for complex joins when fetching data, improving performance for read-heavy systems.
Denormalization vs. Normalization
• Normalization: Focuses on removing redundancy, improving data integrity, and ensuring efficient
storage. It splits data into logical, smaller tables and avoids duplicate entries.
• Denormalization: Introduces redundancy by combining related tables, which reduces the need for
complex joins, thus optimizing query performance for read-heavy operations.
Advantages of Denormalization
• Improved Query Performance: By reducing the need for joins, denormalization speeds up read
operations and query performance.
• Reduced Complexity: It simplifies queries and the overall database schema by consolidating data
into fewer tables.
• Easier Maintenance and Updates: Fewer tables make it easier to update the schema or modify
queries.
• Improved Read Performance: The database is optimized for quick read access, which is beneficial
for systems with high read-to-write ratios.
• Better Scalability: Systems that focus on read-heavy operations benefit from denormalization due
to reduced joins and simpler query execution plans.
Disadvantages of Denormalization
• Reduced Data Integrity: Redundant data increases the risk of data inconsistencies. Updates need to
be applied to all copies of duplicated data.
• Increased Complexity: While it can simplify queries, the introduction of redundant data can
complicate database management and schema changes.
• Increased Storage Requirements: Redundant data consumes more storage space, potentially
increasing costs and database size.
• Increased Update and Maintenance Complexity: When data changes, it must be updated in all
places where it appears, which can lead to issues with consistency if not properly managed.
• Limited Flexibility: Redundancy makes it more difficult to modify the database schema, as changes
must be reflected in all places where data is duplicated.
When Should You Use Denormalization?
Denormalization is most useful in systems where read performance is more critical than write performance.
It is ideal for:
• Read-heavy systems: Applications where queries are frequent and must be optimized for quick
retrieval.
• Reporting systems: Where complex queries or aggregations are frequently run and performance is
a priority.
• Data Warehouses: For systems focused on large-scale data analysis and querying.
How to Maintain Data Consistency in Denormalized Databases
To address the main drawback of denormalization—data inconsistency—the following techniques are
typically used:
• Triggers and Stored Procedures: These can be employed to ensure that when data is updated in one table, all related copies are updated accordingly (see the sketch after this list).
• Caching: Helps avoid repeated updates by storing frequently accessed data.
• Version Control: To keep track of changes and avoid discrepancies across redundant data entries.
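A sketch of the trigger approach (MySQL-style syntax assumed; the tables continue the hypothetical Courses/Teachers example above): when a teacher's name changes, the redundant copy in the denormalized table is updated automatically.

CREATE TRIGGER sync_teacher_name
AFTER UPDATE ON Teachers
FOR EACH ROW
  UPDATE Courses_Denorm
  SET Teacher_Name = NEW.Teacher_Name      -- propagate the change to every redundant copy
  WHERE Teacher_ID = NEW.Teacher_ID;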
Denormalization vs. Data Aggregation
• Denormalization: Involves adding redundant raw data by combining tables, often resulting in larger
table sizes.
• Data Aggregation: Focuses on reducing data size by summarizing it (e.g., computing averages or
totals). Aggregation does not increase redundancy, but rather reduces the total amount of data by
summarizing it.
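A small sketch of the difference, assuming a hypothetical Sales(Region, Amount) table: aggregation produces a summarized result rather than duplicating raw rows.

-- Data aggregation: summarize instead of copying raw data
SELECT Region,
       SUM(Amount) AS Total_Sales,
       AVG(Amount) AS Avg_Sale
FROM Sales
GROUP BY Region;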
Conclusion
Denormalization is a useful technique for improving the performance of read-heavy systems by reducing
the complexity of joins. However, it introduces redundancy, which can lead to data integrity issues and
increased maintenance complexity. Its application should be considered carefully, particularly in systems
where performance and scalability are prioritized over strict adherence to normalization principles.
Example of a Serial Schedule (T1 runs completely, then T2):
Step Transaction Operation
1 T1 Read(A)
2 T1 Write(A)
3 T1 Read(B)
4 T1 Write(B)
5 T2 Read(A)
6 T2 Write(A)
Non-Serial Schedule:
A non-serial schedule allows transactions to execute concurrently, meaning their operations are
interleaved.
Example of a Non-Serial Schedule:
Step Transaction Operation
1 T1 Read(A)
2 T2 Read(B)
3 T1 Write(A)
4 T2 Write(B)
1 T1 Read(A)
2 T2 Write(A)
3 T1 Write(A)
1 T1 Read(A)
2 T2 Write(A)
3 T1 Write(A)
• If this schedule produces the same final state as a serial execution, it is view serializable.
• View serializability is weaker than conflict serializability.
1 T1 Read(A)
2 T2 Read(A)
3 T1 Write(A)
4 T2 Write(A)
Conclusion
• Schedules control transaction execution order.
• Serializable schedules ensure correct transaction execution.
• Conflict serializability can be tested using precedence graphs.
• View serializability is less strict than conflict serializability.
1. Conflict Serializability
What is Conflict Serializability?
A schedule is Conflict Serializable if it can be transformed into a serial schedule by swapping non-
conflicting operations without changing the result.
Conflicting Operations:
Two operations conflict if:
1. They belong to different transactions.
2. They operate on the same data item.
3. At least one of them is a WRITE operation.
Types of Conflicts:
• Write–Read (WR): one transaction reads a value written by another.
• Read–Write (RW): one transaction writes a value previously read by another.
• Write–Write (WW): both transactions write the same data item.
(Read–Read pairs never conflict.)
Example:
Step Transaction Operation
1 T1 Read(A)
2 T1 Write(A)
3 T2 Read(A)
4 T2 Write(A)
Step 1: Find Conflicting Operations
• T1 Write(A), T2 Read(A) → T1 → T2
• T1 Write(A), T2 Write(A) → T1 → T2
Step 2: Draw Precedence Graph
T1 → T2
No cycle → Conflict Serializable!
Step Transaction Operation
1 T1 Read(A)
2 T2 Write(A)
3 T1 Write(A)
Step 1: Find Conflicting Operations
• T1 Read(A), T2 Write(A) → T1 → T2
• T2 Write(A), T1 Write(A) → T2 → T1
Step 2: Draw Precedence Graph
T1 → T2
T2 → T1 (Cycle detected → Not Conflict Serializable)
2. View Serializability
What is View Serializability?
A schedule is View Serializable if it produces the same final result as a serial schedule, even if operations
cannot be swapped like in conflict serializability.
Conditions for View Serializability:
A schedule is view serializable if it satisfies:
1. Initial Reads are the same – Every transaction reads the same value as in a serial execution.
2. Final Writes are the same – The last write operation in both schedules must be the same.
3. Read-Write Order is Maintained – If a transaction T2 reads a value written by T1, this order must be
preserved in the equivalent serial schedule.
View serializability is more flexible than conflict serializability but harder to test.
Step Transaction Operation
1 T1 Read(A)
2 T2 Write(A)
3 T1 Write(A)
1 T1 Read(A)
2 T2 Write(A)
3 T1 Write(A)
4 T2 Read(A)
Flexibility: Conflict serializability is stricter (if a schedule is conflict serializable, it is always view serializable); view serializability is more flexible (some schedules are view serializable but not conflict serializable).
5. Summary
View Serializability: Ensures transactions produce the same final result as a serial execution.
Conflict vs. View: Conflict serializability is stricter; view serializability is more flexible.
Cycle in Precedence Graph: If a cycle exists → Not conflict serializable.
1. Recoverability of Schedules
A schedule is recoverable if it ensures that transactions commit only after all transactions whose changes
they depend on have committed.
Types of Recoverability:
1. Recoverable Schedule
2. Cascadeless Schedule
3. Strict Schedule
Example of a Recoverable Schedule (T2 commits only after T1 commits):
Step Transaction Operation
1 T1 Write(A)
2 T2 Read(A)
3 T1 Commit
4 T2 Commit
Example of a Non-Recoverable (Irrecoverable) Schedule (T2 commits before T1, then T1 aborts):
Step Transaction Operation
1 T1 Write(A)
2 T2 Read(A)
3 T2 Commit
4 T1 Abort
Example of Cascading Rollback (T2 and T3 read uncommitted data, so T1's abort forces them to roll back):
Step Transaction Operation
1 T1 Write(A)
2 T2 Read(A)
3 T3 Read(A)
4 T1 Abort
Example of a Cascadeless Schedule (T2 reads A only after T1 commits):
Step Transaction Operation
1 T1 Write(A)
2 T1 Commit
3 T2 Read(A)
4. Lock-Based Protocols
Locks restrict access to data items to ensure consistency.
4.1 Granularity of Locks
Granularity defines the level at which locks are applied in the database.
Levels of Lock Granularity: Locks can be applied at different levels, such as the entire database, a table, or an individual row.
Lock Modes:
• Shared (S): Read-only access; multiple transactions may hold it, so concurrency is high.
• Exclusive (X): Only one transaction can READ and WRITE; concurrency is low.
Summary:
• Cascadeless Schedule: Transactions read only committed values to prevent cascading rollbacks.
• Granularity of Locks: Locks can be applied at different levels (DB, Table, Row, etc.).
• Shared vs. Exclusive Locks: Shared (Read only, allows concurrency), Exclusive (Read/Write, blocks others).
• Locking Protocols: 2PL, Strict 2PL, and Rigorous 2PL ensure consistency and prevent conflicts.
Example of 2PL:
Example Schedule:
1 T1 Lock-X(A)
2 T1 Write(A)
3 T2 Lock-X(B)
4 T2 Write(B)
This schedule follows 2PL since unlocking happens after acquiring all locks.
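In SQL, the lock and unlock steps are normally handled by the DBMS inside a transaction. A sketch (FOR UPDATE syntax as supported by MySQL/PostgreSQL; the Accounts table is illustrative) that behaves like strict 2PL: locks are acquired as rows are touched and released only at COMMIT.

START TRANSACTION;

-- Growing phase: acquire an exclusive lock on the row for A
SELECT Balance FROM Accounts WHERE Acc_ID = 'A' FOR UPDATE;

UPDATE Accounts SET Balance = Balance - 100 WHERE Acc_ID = 'A';

-- Shrinking phase: all locks are released together at commit (strict 2PL behaviour)
COMMIT;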
Advantage: Better consistency.
Disadvantage: More waiting time.
2. Timestamp-Based Protocol
The Timestamp-Based Concurrency Control Protocol ensures that transactions execute in order of their
timestamps without locks.
How Timestamps Work?
• Each transaction Ti is assigned a timestamp TS(Ti) when it starts.
• The timestamp represents the order in which transactions should execute.
• Each data item Q has two timestamps:
1. Read Timestamp (RTS(Q)) – Last transaction that read Q.
2. Write Timestamp (WTS(Q)) – Last transaction that wrote Q.
T3 is aborted because it tries to write an old value after a newer write by T2.
Advantages: No deadlocks and no locks required; serializability is guaranteed by timestamp ordering.
Disadvantages: High abort rate and wasted computation, so it is not ideal for high-contention workloads.
5. Summary Table
Concept Key Points
Strict 2PL: Holds exclusive locks until commit, preventing cascading rollbacks.
Rigorous 2PL: Holds all locks until commit, ensuring maximum consistency.
Advantages of Timestamp Protocols: No deadlocks, no locks required, guarantees serializability.
Disadvantages of Timestamp Protocols: High abort rate, wasted computation, not ideal for high-contention databases.
Prevention of Hold & Wait – A transaction must request all resources at once.
Prevention of Circular Wait – Order transactions to prevent cycles.
Prevention of No Preemption – If a transaction requests a resource and is denied, it must release all
held resources and restart.
Wait-Die Scheme – Older transactions can wait, but younger transactions are aborted.
Wound-Wait Scheme – Older transactions can force younger transactions to abort and restart.
3. Deadlock Detection and Recovery (Allow deadlocks and resolve them when detected)
The system detects deadlocks using a Wait-for Graph and then recovers by aborting transactions.
T1 → T2 → T3 → T1 (Cycle exists → deadlock detected)
After aborting a transaction: T1 → T2 (No cycle)
Deadlock resolved!
4. Summary Table
Necessary Conditions for Deadlock Mutual Exclusion, Hold & Wait, No Preemption, Circular Wait.