DBMS Notes
Attributes are the properties or characteristics of an entity; they are used to describe it. An
attribute is a piece of data that gives more information about the entity, and attribute values
are what distinguish one entity from another. Attributes also help make the database more
structured and hierarchical.
Example
• Let's take the student as an entity. Students will have multiple attributes such as roll number,
name, and class.
• As shown in the figure, roll_no, name, and class are the attributes of the entity Student.
Types of Attributes
Key Attribute
• The attribute which has unique values for every row in the table is known as a Key Attribute.
The key attribute has a very crucial role in the database.
Example
• For students, we can identify every student with roll_no because each student will have a
unique roll_no.
• This indicates that roll_no is a Key attribute for the Student entity.
Composite Attribute
When two or more simple attributes are combined to form a single attribute, that attribute is called a
Composite attribute.
• address is an attribute composed of three simple attributes: City, State, and Street.
• To get the value of the address attribute, we first have to know the city, state, and street attributes.
Figure: Composite attribute
Multivalued Attribute
• An attribute that can hold more than one value for a single instance of an entity is known as a
multivalued attribute.
• One student can have multiple phone numbers, so phone_no can have multiple values.
• Multivalued attributes are used when more than one entry for an attribute needs to be stored in the
database.
Figure: Multivalued attribute
Derived Attribute
• An attribute whose value can be derived from other attributes, and so does not need to be stored
in the database, is called a Derived Attribute.
• Derived attributes are not stored in the database directly. They are calculated from the stored
attributes in the database.
Example
• Here the student has multiple attributes including DOB and age. It is observed that age can be
calculated with the help of the DOB attribute.
Figure: Derived attribute
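As a minimal sketch of how age can be derived from DOB rather than stored: the function name age_on and the sample student record are hypothetical, chosen only for illustration.

```python
from datetime import date

def age_on(dob: date, today: date) -> int:
    """Derive age in whole years from the stored DOB attribute."""
    # Subtract one year if this year's birthday has not happened yet.
    had_birthday = (today.month, today.day) >= (dob.month, dob.day)
    return today.year - dob.year - (0 if had_birthday else 1)

# Hypothetical stored record: only DOB is kept, age is computed on demand.
student = {"roll_no": 1, "name": "Asha", "dob": date(2004, 8, 15)}
print(age_on(student["dob"], date(2024, 8, 14)))  # 19
print(age_on(student["dob"], date(2024, 8, 15)))  # 20
```

Because age changes every year while DOB never does, storing only DOB avoids keeping a value that goes stale.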
Degree of Relationship
In DBMS, the degree of a relationship is the number of entity types that participate in the
relationship. For example, suppose we have two entities, Student and Bag, that are connected
through a primary key and a foreign key. The degree of this relationship is 2, because two
entities participate in the relationship set.
1. Unary Relationship: When there is only ONE entity set participating in a relationship,
the relationship is called a unary relationship. For example, one person is married to only one
person.
2. Binary Relationship: When there are TWO entity sets participating in a relationship,
the relationship is called a binary relationship. For example, a Student is enrolled in a
Course.
Figure: Binary Relationship
3. Ternary Relationship: When there are THREE entity sets participating in a relationship, the
relationship is called a ternary relationship.
4. N-ary Relationship: When there are n entity sets participating in a relationship, the
relationship is called an n-ary relationship.
• One-to-One
• One-to-Many
• Many-to-One
• Many-to-Many
One-to-One
In this type of cardinality mapping, an entity in A is connected to at most one entity in
B, and an entity in B is connected to at most one entity in A.
Figure: One-to-One
One-to-Many
In this type of cardinality mapping, an entity in A is associated with any number of entities in
B, while an entity in B can be connected to at most one entity in A.
Figure: One-to-Many
Example:
In a particular hospital, the surgeon department has multiple doctors. The department and its
doctors form a one-to-many relationship.
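Many-to-One
In this type of cardinality mapping, an entity in A is associated with at most one entity in
B, while an entity in B can be associated with any number of entities in A.
Many-to-Many
In this type of cardinality mapping, an entity in A is associated with any number of entities in
B, and an entity in B is associated with any number of entities in A.
Example:
In a particular company, multiple people work on multiple projects. People and projects form a
many-to-many relationship.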
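The four mapping types can be told apart by counting partners on each side. The sketch below classifies a set of observed (a, b) pairs; the function name and sample data are hypothetical, and it only inspects the data it is given, whereas a real cardinality is a rule declared in the schema.

```python
def mapping_cardinality(pairs):
    """Classify a relationship, given as (a, b) pairs, from A to B."""
    a_to_b, b_to_a = {}, {}
    for a, b in pairs:
        a_to_b.setdefault(a, set()).add(b)
        b_to_a.setdefault(b, set()).add(a)
    a_many = any(len(bs) > 1 for bs in a_to_b.values())    # some a has several b's
    b_many = any(len(as_) > 1 for as_ in b_to_a.values())  # some b has several a's
    if a_many and b_many:
        return "many-to-many"
    if a_many:
        return "one-to-many"
    if b_many:
        return "many-to-one"
    return "one-to-one"

# Hypothetical sample data.
dept_doctor = [("Surgery", "Dr. Rao"), ("Surgery", "Dr. Sen")]
works_on = [("Ann", "P1"), ("Ann", "P2"), ("Bob", "P1")]
print(mapping_cardinality(dept_doctor))  # one-to-many
print(mapping_cardinality(works_on))     # many-to-many
```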
Enhanced ER Diagram
Enhanced entity-relationship (EER) diagrams are used to represent the requirements and
complexity of complicated databases. They are very similar to standard ER diagrams, with
some additional concepts.
Subclass and superclass, specialization and generalization, union or category,
aggregation, etc., are displayed using this diagrammatic style.
Generalization and Specialization
These are two common kinds of relationships that were added to the basic ER model to
enhance it. They are inspired by the object-oriented paradigm, where code is divided
into classes and objects; in the same way, entities are divided into subclasses and
superclasses. Specialized classes are called subclasses, and generalized classes are called
superclasses or base classes. We can identify a subclass with an 'IS-A' test. For
example, 'Laptop IS-A computer' or 'Clerk IS-A employee'.
In this relationship, one entity is a subclass or superclass of another entity. For
example, in a university, a faculty member or a clerk is a specialized class of
employee. So Employee is a generalized class, and all the others are its subclasses.
We can draw the ER diagram for these relationships. Let's suppose we have a
superclass Employee and the subclasses Clerk, Engineer, and Lab Assistant.
The Enhanced ER diagram of the above example will look like this:
Functional Dependencies
A functional dependency occurs when one attribute uniquely determines another attribute
within a relation. If attribute A functionally determines attribute B, we write this as
A → B.
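To make the definition concrete, here is a minimal sketch that tests whether A → B holds in a given set of rows. The function fd_holds and the sample rows are hypothetical; the check only shows that the dependency is not violated by this particular data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[b] for b in rhs)
        # The same lhs value must always map to the same rhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True

students = [
    {"roll_no": 1, "name": "Asha", "class": "10A"},
    {"roll_no": 2, "name": "Ravi", "class": "10A"},
    {"roll_no": 3, "name": "Asha", "class": "10B"},
]
print(fd_holds(students, ["roll_no"], ["name"]))  # True
print(fd_holds(students, ["name"], ["class"]))    # False
```

Here roll_no → name holds because every roll_no appears with exactly one name, while name → class fails because the name Asha appears with two different classes.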
Types of Functional Dependencies in DBMS
1. Trivial functional dependency
2. Non-Trivial functional dependency
3. Multivalued functional dependency
4. Transitive functional dependency
https://www.youtube.com/watch?v=Ezum7q-wwTA
Unit 3
Query Processing
Query processing is the process of extracting data from a database. It includes the translation of
high-level queries into low-level expressions that can be used at the physical level of the file
system, query optimization, and the actual execution of the query to get the result. Retrieving the
data takes several steps. The steps involved are:
Optimization
During the optimization stage, the database must perform a hard parse at least for one
Step-3
Evaluation
Finally, the query is run and the required result is displayed.
Measures of Query
Select operation
The "select operation" refers to the process of retrieving specific data from a database based on
certain criteria or conditions.
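A minimal sketch of a select operation over in-memory rows follows. The function name select, the predicate argument, and the sample rows are hypothetical; a real DBMS would evaluate the condition against stored tables, usually with the help of indexes.

```python
def select(rows, predicate):
    """Return the rows that satisfy the predicate (the selection condition)."""
    return [row for row in rows if predicate(row)]

students = [
    {"roll_no": 1, "name": "Asha", "class": "10A"},
    {"roll_no": 2, "name": "Ravi", "class": "10B"},
    {"roll_no": 3, "name": "Meena", "class": "10A"},
]

# Keep only the students whose class is 10A.
in_10a = select(students, lambda row: row["class"] == "10A")
print([row["name"] for row in in_10a])  # ['Asha', 'Meena']
```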
Unit 4
What is Transaction Processing?
In a Database Management System (DBMS), transaction processing means handling a
group of operations (called a transaction) as a single unit.
These operations are done together, and either:
• All of them succeed, or
• None of them happen at all.
This ensures the database stays correct and consistent—even if there’s an error, power
failure, or system crash.
What is a Transaction?
A transaction is just a group of actions done together for a specific purpose.
For example, updating a customer’s balance, inserting a new order, or deleting an account.
A transfer of Rs. 500 from Account A to Account B involves the following steps:
Account A (Sender)
1. Open Account A
2. Read current balance → Old_Balance = A.balance
3. Subtract Rs. 500 → New_Balance = Old_Balance - 500
4. Update balance → A.balance = New_Balance
5. Close Account A
Account B (Receiver)
1. Open Account B
2. Read current balance → Old_Balance = B.balance
3. Add Rs. 500 → New_Balance = Old_Balance + 500
4. Update balance → B.balance = New_Balance
5. Close Account B
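The two groups of steps above only make sense as one unit: if the credit to B fails after A has been debited, the debit must be undone. The sketch below illustrates that all-or-nothing behaviour with a plain dictionary. The transfer function, the snapshot-based rollback, and the insufficient-funds check are assumptions made for illustration, not how a real DBMS implements it.

```python
def transfer(balances, sender, receiver, amount):
    """Move amount from sender to receiver as one all-or-nothing unit."""
    snapshot = dict(balances)          # remember the state before the transaction
    try:
        balances[sender] -= amount     # steps 2-4 for Account A
        if balances[sender] < 0:       # assumed rule: no negative balances
            raise ValueError("insufficient funds")
        balances[receiver] += amount   # steps 2-4 for Account B
    except Exception:
        balances.clear()
        balances.update(snapshot)      # roll back: restore the old balances
        raise

accounts = {"A": 1000, "B": 200}
transfer(accounts, "A", "B", 500)
print(accounts)  # {'A': 500, 'B': 700}
```

If the receiver does not exist or the sender lacks funds, the exception is re-raised only after the balances have been restored, so no partial update is ever visible.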
3. Failed State
• Something went wrong, for example:
o a system crash, or
o the transaction could not save its data properly.
• The transaction cannot continue and must be stopped.
4. Aborted State
• After failure, the transaction rolls back any changes.
• Since changes were only in memory or local buffers, they can be undone.
• Once rollback is done → transaction ends.
5. Committed State
• The transaction successfully saved all its changes permanently to the database.
• No more going back.
• It can now be marked as completed.
6. Terminated State
• The final state.
• The transaction is officially over (whether it was committed or aborted).
• The system is now ready to handle the next transaction.
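The states can be read as a small state machine. The sketch below encodes the allowed moves between them. The "active" and "partially_committed" states are not described in these notes and are assumed from the usual textbook model; the names TRANSITIONS and is_valid_path are hypothetical.

```python
# Allowed transitions between transaction states. "active" and
# "partially_committed" are assumed from the usual textbook model.
TRANSITIONS = {
    "active": {"partially_committed", "failed"},
    "partially_committed": {"committed", "failed"},
    "failed": {"aborted"},
    "aborted": {"terminated"},
    "committed": {"terminated"},
    "terminated": set(),
}

def is_valid_path(states):
    """Check that each consecutive pair of states is an allowed transition."""
    return all(b in TRANSITIONS[a] for a, b in zip(states, states[1:]))

print(is_valid_path(["active", "partially_committed", "committed", "terminated"]))  # True
print(is_valid_path(["active", "failed", "aborted", "terminated"]))                 # True
print(is_valid_path(["committed", "aborted"]))                                      # False
```

The last call is rejected because a committed transaction cannot be rolled back.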
In a Database Management System (DBMS), concurrent execution refers to the ability of multiple
users to access and work with the database at the same time. In real-world systems, it is common
for many users to perform different operations (such as reading, updating, or deleting data)
simultaneously. This is especially important in multi-user environments where system efficiency
and responsiveness are critical.
During concurrent execution, transactions from different users are interleaved, meaning their
operations are mixed together in time. The DBMS must make sure that each transaction is still
processed correctly without being affected by the others. The main goal is to maintain the
consistency and integrity of the database, even when multiple transactions are happening at once.
However, concurrent execution also introduces challenges, such as the lost update, dirty read,
unrepeatable read, and phantom read problems. These issues occur when transactions interfere
with each other, potentially leading to incorrect or inconsistent data. To avoid these problems,
DBMSs use concurrency control techniques, such as locking mechanisms and transaction isolation
levels, which manage the execution of transactions so that the final outcome is the same as if
the transactions had been executed one after the other (serially). Thus, concurrent execution
is essential for performance, but it must be carefully managed to preserve data correctness.
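The lost update problem can be reproduced with a deterministic simulation. In the sketch below, each transaction reads the balance into a private copy, changes the copy, and writes it back. The run function, the step format, and the amounts are hypothetical and chosen only to show the effect of interleaving.

```python
def run(schedule, balance):
    """Run (txn, op, amount) steps against a single shared balance."""
    db = {"balance": balance}
    local = {}  # each transaction's private copy of the balance
    for txn, op, amount in schedule:
        if op == "read":
            local[txn] = db["balance"]
        elif op == "add":
            local[txn] += amount
        elif op == "write":
            db["balance"] = local[txn]
    return db["balance"]

serial = [("T1", "read", 0), ("T1", "add", 100), ("T1", "write", 0),
          ("T2", "read", 0), ("T2", "add", 50), ("T2", "write", 0)]
interleaved = [("T1", "read", 0), ("T2", "read", 0), ("T1", "add", 100),
               ("T1", "write", 0), ("T2", "add", 50), ("T2", "write", 0)]

print(run(serial, 1000))       # 1150
print(run(interleaved, 1000))  # 1050: T1's update is lost
```

In the interleaved run, T2 read the balance before T1 wrote it, so T2's write overwrites T1's result. Concurrency control exists to prevent exactly this kind of schedule.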
Serializability in DBMS means making sure that when multiple transactions run at the
same time, the result is still correct, just as if the transactions had run one after the other.
In a serial schedule, each transaction runs completely before the next one starts, so
transactions cannot interfere with each other. But in real systems, to save time and improve
performance, transactions often run at the same time (called concurrent execution), and
their steps get mixed together.
This mixing of steps is called a non-serial schedule. Serializability helps us check whether this
mixed schedule still gives the same correct result as a serial one. If it does, then the schedule
is called a serializable schedule. So, serializability is a way to make sure that even though
transactions are running together, the database stays correct and consistent, just as when
they run one by one. All serial schedules are serializable, but not all non-serial
schedules are.
Types of Serializability
1. Conflict Serializability
2. View Serializability
Conflict Serializable Schedule (Simple Explanation)
Key Idea:
Example Summary
• S1 = S2 → No conflict → Schedule is conflict serializable
• S1 ≠ S2 → Conflict exists → Not conflict serializable
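One common way to test conflict serializability is the precedence graph, a technique these notes do not describe, so the sketch below swaps it in as an assumption. Two operations conflict when they belong to different transactions, access the same data item, and at least one of them is a write. Each conflict adds an edge from the earlier transaction to the later one, and the schedule is conflict serializable exactly when the graph has no cycle. The schedule encoding is hypothetical.

```python
def is_conflict_serializable(schedule):
    """schedule: list of (txn, op, item) with op in {"R", "W"}."""
    edges = {}
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if t1 != t2 and x1 == x2 and "W" in (op1, op2):
                edges.setdefault(t1, set()).add(t2)

    visiting, done = set(), set()

    def has_cycle(node):
        """Depth-first search that reports a cycle reachable from node."""
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        found = any(has_cycle(n) for n in edges.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return not any(has_cycle(t) for t in list(edges))

ok = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
bad = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
print(is_conflict_serializable(ok))   # True
print(is_conflict_serializable(bad))  # False
```

In the second schedule, T1 reads A before T2 writes it (edge T1 → T2) and T2 writes A before T1 writes it (edge T2 → T1), which forms a cycle.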
Two schedules are view equivalent if they meet all three of these conditions:
1. Initial Read
The first read of each data item must be from the same transaction in both schedules.
2. Updated Read
If a transaction reads a value that was written by another transaction in one schedule, it
must do the same in the other.
3. Final Write
The last write to any data item must be done by the same transaction in both schedules.
Key Points:
• Every conflict serializable schedule is also view serializable.
• But some view serializable schedules are not conflict serializable; this happens when there
are blind writes (writes without a prior read).
Recoverability in DBMS
Recoverability is a critical feature of database systems that ensures the database can return
to a consistent and reliable state after a failure or error. It guarantees that the effects of
committed transactions are saved permanently, while uncommitted transactions are rolled
back to maintain data integrity. To make this possible, the system keeps logs of the changes
made by transactions. These logs enable the system to either undo the changes of
uncommitted transactions or redo the committed ones when a failure occurs.
There are several levels of recoverability that can be supported by a database system:
No-undo logging: This level of recoverability only guarantees that committed transactions
are durable, but does not provide the ability to undo the effects of uncommitted
transactions.
Undo logging: This level of recoverability provides the ability to undo the effects of
uncommitted transactions but may result in the loss of updates made by committed
transactions that occur after the failed transaction.
Redo logging: This level of recoverability provides the ability to redo the effects of
committed transactions, ensuring that all committed updates are durable and can be
recovered in the event of failure.
Undo-redo logging: This level of recoverability provides both undo and redo capabilities,
ensuring that the system can recover to a consistent state regardless of whether a
transaction has been committed or not.
Example 1:
T1        T2
R(A)
W(A)
          W(A)
          R(A)
commit
          commit
This is a recoverable schedule, since T1 commits before T2, which makes the value read by T2
correct.
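The rule behind this kind of example can be checked mechanically: a schedule is recoverable if every transaction commits only after all the transactions it read from have committed. The sketch below is a simplified check under that definition. The schedule encoding and the function name are hypothetical, and aborts are ignored.

```python
def is_recoverable(schedule):
    """schedule: list of (txn, op, item); op is "R", "W", or "C" (commit)."""
    last_writer = {}   # item -> transaction that wrote it most recently
    reads_from = {}    # txn -> set of transactions whose writes it read
    committed = set()
    for txn, op, item in schedule:
        if op == "W":
            last_writer[item] = txn
        elif op == "R":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif op == "C":
            # txn may commit only after every transaction it read from.
            if not reads_from.get(txn, set()) <= committed:
                return False
            committed.add(txn)
    return True

good = [("T1", "W", "A"), ("T2", "R", "A"), ("T1", "C", None), ("T2", "C", None)]
bad = [("T1", "W", "A"), ("T2", "R", "A"), ("T2", "C", None), ("T1", "C", None)]
print(is_recoverable(good))  # True
print(is_recoverable(bad))   # False
```

In the second schedule T2 commits a value it read from T1 before T1 has committed, so a later failure of T1 could not be undone cleanly.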
Example 2:
Example: T1 wants A, B, C
→ It will request all three at once before beginning.
1. Growing Phase
2. Shrinking Phase
o Can release locks
The point where it stops acquiring and starts releasing is called the lock point.
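The two-phase rule is easy to check for a single transaction: once any lock has been released, no new lock may be acquired. The sketch below returns the position of the lock point, or raises an error if the rule is broken. The operation format and the function name lock_point are hypothetical.

```python
def lock_point(ops):
    """ops: list of ("lock" | "unlock", item) for one transaction.

    Returns the index of the last lock acquisition (the lock point),
    or raises ValueError if a lock is taken after any unlock.
    """
    shrinking = False
    point = None
    for i, (action, item) in enumerate(ops):
        if action == "unlock":
            shrinking = True           # the shrinking phase has begun
        elif action == "lock":
            if shrinking:
                raise ValueError(f"lock on {item} after an unlock violates 2PL")
            point = i                  # still in the growing phase
    return point

ops = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
print(lock_point(ops))  # 1
```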
Summary Table
Protocol                | When Locks Are Taken                     | When Locks Are Released | Prevents Cascading Aborts?
Simplistic              | Before each operation                    | After transaction ends  |
Pre-claiming            | Before transaction starts (all at once)  | After transaction ends  |
Two-Phase Locking (2PL) | In growing phase                         | In shrinking phase      |
Strict 2PL              | In growing phase                         | Only after commit       |
This protocol manages the order of transactions based on when they entered the system
(i.e., their timestamps).
Key Concepts:
• Every transaction gets a timestamp when it starts.
How It Works:
• If W_TS(X) > TS(Ti) → Reject the read (Ti is too old, X was already written by a newer
transaction).
• If TS(Ti) < R_TS(X) → Reject (Ti is trying to overwrite something already read by a newer
transaction).
• If TS(Ti) < W_TS(X) → Reject & Rollback (X already written by a newer transaction).
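The three rules above can be sketched as two checks, one for reads and one for writes. The Item class and the read and write functions are hypothetical names. The sketch only decides whether an operation is accepted; it does not perform the rollback and restart that a rejected transaction would need.

```python
class Item:
    """A data item X with its read and write timestamps."""
    def __init__(self):
        self.r_ts = 0  # R_TS(X): largest timestamp that has read X
        self.w_ts = 0  # W_TS(X): largest timestamp that has written X

def read(item, ts):
    """Return True if the transaction with timestamp ts may read item."""
    if item.w_ts > ts:          # X was already written by a newer transaction
        return False
    item.r_ts = max(item.r_ts, ts)
    return True

def write(item, ts):
    """Return True if the transaction with timestamp ts may write item."""
    if ts < item.r_ts or ts < item.w_ts:  # a newer transaction read or wrote X
        return False
    item.w_ts = ts
    return True

x = Item()
print(write(x, 9))  # True: nothing newer has touched X
print(read(x, 7))   # False: X was written by a newer transaction (9 > 7)
print(read(x, 9))   # True
print(write(x, 7))  # False: X was read and written at timestamp 9
```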
Advantages:
• Ensures serializability.
Disadvantages:
Example:
Let’s say:
• T1 → timestamp = 007
• T2 → timestamp = 009
T1 has the smaller timestamp, so it is the older transaction. It has higher priority and must
execute before T2 if they access the same data.