
MODULE-3,4 and 5

1. Explain the Cursor & its properties in embedded SQL with an example.
A cursor in SQL is a database object used to retrieve, process, and manipulate data one row at a time. While SQL is designed to handle large data sets in bulk, sometimes we need to focus on one row at a time. A cursor is a temporary work area allocated by the database server to process the rows returned by a query or affected by DML operations. It allows processing query results row by row instead of applying operations to the entire set. Typical uses:
• Performing conditional logic row by row.
• Looping through data to calculate or transform fields.
• Iterating over result sets for conditional updates or transformations.
• Handling hierarchical or recursive data structures.
• Performing clean-up tasks that cannot be done with a single SQL query.
Implicit Cursors
In PL/SQL, when we perform INSERT, UPDATE or DELETE operations, an implicit cursor
is automatically created. This cursor holds the data to be inserted or identifies the rows
to be updated or deleted.
Useful Attributes:
• %FOUND: True if the SQL operation affects at least one row.
• %NOTFOUND: True if no rows are affected.
• %ROWCOUNT: Returns the number of rows affected.
• %ISOPEN: Checks if the cursor is open.
Explicit Cursors
These are user-defined cursors created explicitly by users for custom operations. They
provide complete control over every part of their lifecycle: declaration, opening,
fetching, closing, and deallocating.
Explicit cursors are useful when:
• We need to loop through results manually
• Each row needs to be handled with custom logic
• We need access to row attributes during processing
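A minimal explicit-cursor sketch in PL/SQL (the employees table and the salary threshold are illustrative assumptions):

DECLARE
    -- declare the cursor over the query to be processed row by row
    CURSOR emp_cur IS
        SELECT id, name, salary FROM employees;
    v_row emp_cur%ROWTYPE;
BEGIN
    OPEN emp_cur;                       -- open: run the query
    LOOP
        FETCH emp_cur INTO v_row;       -- fetch: advance one row
        EXIT WHEN emp_cur%NOTFOUND;     -- attribute-based loop exit
        IF v_row.salary < 30000 THEN    -- custom per-row logic
            DBMS_OUTPUT.PUT_LINE(v_row.name || ' is below the threshold');
        END IF;
    END LOOP;
    CLOSE emp_cur;                      -- close and release the work area
END;
/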

2. What is a Normalization? Explain the 1NF, 2NF & 3NF with examples.
Normalization is a systematic approach to organize data within a database to reduce
redundancy and eliminate undesirable characteristics such as insertion, update,
and deletion anomalies. The process involves breaking down large tables into smaller,
well-structured ones and defining relationships between them. This not only reduces
the chances of storing duplicate data but also improves the overall efficiency of the
database.
First normal form (1NF)
In 1NF, every cell of a relation contains an atomic value that can't be further divided, i.e., the relation shouldn't have multivalued attributes.
Example:
The following table stores two phone numbers in a single attribute for one student.

Emp_ID | Student Name | Phone Number
1      | John         | 12345767890
2      | Claire       | 9242314321, 7689025341

So to convert it into 1NF, we decompose the table as follows:

Emp_ID | Student Name | Phone Number
1      | John         | 12345767890
2      | Claire       | 9242314321
2      | Claire       | 7689025341
Here, we can notice data repetition, but 1NF doesn’t care about it.

Second Normal Form (2NF)

In 2NF, the relation must be in 1NF, and no partial dependency should exist. A partial dependency arises when a non-prime attribute depends on only a part of a composite candidate (or primary) key rather than on the whole key.

Example 1: (depicting partial dependency issues)


Suppose we are given a relation R(A, B, C, D) with {A, B} as the primary key, where A and B cannot both be NULL simultaneously but either can be NULL independently, and C and D are non-prime attributes. Suppose the functional dependency B → C holds. Can this ever be a problem? If B is NULL, it can never determine the value of C; and since B → C is a partial dependency (C depends on only part of the key), it creates update problems. Therefore, non-prime attributes must not be determined by a part of the primary key. We can remove the partial dependency by creating two relations (the 2NF conversion):
Relation 1 = R1(A, B, D), where {A, B} is the primary key and AB determines D.
Relation 2 = R2(B, C), where B is the primary key and B determines C.

Example 2:
Consider the following table. Its primary key is {StudentId, ProjectId}.
The Functional dependencies given are -
StudentId → StudentName
ProjectId → ProjectName

StudentId | ProjectId | Student Name | Project Name
1         | P2        | John         | IOT
2         | P1        | Claire       | Cloud
3         | P7        | Clara        | IOT
4         | P3        | Abhk         | Cloud

As these dependencies are partial (each non-prime attribute depends on only part of the key), we decompose the table as follows:

StudentId | ProjectId | Student Name
1         | P2        | John
2         | P1        | Claire
3         | P7        | Clara
4         | P3        | Abhk

ProjectId | Project Name
P2        | IOT
P1        | Cloud
P7        | IOT
P3        | Cloud

Here ProjectId is kept in both tables to set up a relationship between them.


Third Normal Form (3NF)
In 3NF, the given relation should be in 2NF, and no transitive dependency should exist, i.e., non-prime attributes should not determine other non-prime attributes.
Example:
Consider a relation where the functional dependencies are A → B and B → C, and A is the primary key.
Here the non-prime attribute C is determined by another non-prime attribute B (A → B → C), so a transitive dependency exists. To remove it, we decompose the relation into 3NF by creating two relations:
R1(A, B), where A is the primary key, and R2(B, C), where B is the primary key.
Boyce-Codd Normal Form (BCNF)
In BCNF, the relation should be in 3NF, and for every functional dependency A → B that holds, A should be a super key. This implies that no attribute, prime or non-prime, may be determined by anything other than a super key.
Example:
Given the following table, whose candidate keys are {Student, Teacher} and {Student, Subject}. The functional dependencies given are:
{Student, Teacher} → Subject
{Student, Subject} → Teacher
Teacher → Subject

Student Name | Subject | Teacher
John         | Physics | Olivia
Claire       | English | Emma
Clara        | Physics | Olivia
Abhk         | English | Sophia

The dependency Teacher → Subject violates BCNF because Teacher is not a super key, so we decompose the table into the following tables:

Student Name | Teacher
John         | Olivia
Claire       | Emma
Clara        | Olivia
Abhk         | Sophia

Teacher | Subject
Olivia  | Physics
Emma    | English
Sophia  | English

Here Teacher is kept in both tables to set up a relationship between them.

3. Explain informal design guidelines for relational schema design.


1.1 Semantics of the Relation Attributes
GUIDELINE 1: Informally, each tuple should represent one entity or relationship
instance.
- Attributes of different entities (EMPLOYEEs, DEPARTMENTs, PROJECTs) should not
be mixed in the same relation
- Only foreign keys should be used to refer to other entities
- Entity and relationship attributes should be kept apart as much as possible.
Bottom Line:

Design a schema that can be explained easily relation by relation. The semantics of
attributes should be easy to interpret.
1.2 Redundant Information in Tuples and Update Anomalies
- Mixing attributes of multiple entities may cause problems
- Information is stored redundantly wasting storage
- Problems with update anomalies:
- Insertion anomalies
- Deletion anomalies
- Modification anomalies
GUIDELINE 2: Design a schema that does not suffer from the insertion, deletion and
update anomalies. If there are any present, then
note them so that applications can be made to take them into account.
1.3 Null Values in Tuples

GUIDELINE 3: Relations should be designed such that their tuples will have as few
NULL values as possible

- Attributes that are NULL frequently could be placed in separate relations (with the
primary key)

- Reasons for nulls:


a. attribute not applicable or invalid
b. attribute value unknown (may exist)
c. value known to exist, but unavailable

1.4 Spurious Tuples


- Bad designs for a relational database may result in erroneous results for certain
JOIN operations
- The "lossless join" property is used to guarantee meaningful results for join
operations
GUIDELINE 4: The relations should be designed to satisfy the lossless join condition.
No spurious tuples should be generated by doing a
natural-join of any relations.

4. What is Functional Dependency? Write the algorithm to find the minimal cover for a given set of functional dependencies. Construct the minimal cover m for the set of functional dependencies E = {B → A, D → A, AB → D}.

Ans: A Functional Dependency (FD) in a relational database represents a relationship


between two sets of attributes.
• If an attribute set X functionally determines Y (written as X → Y), it means:
For any two tuples in the relation, if they agree on X, they must also agree on Y.
Example:
If B → A, then knowing the value of B lets you determine the value of A.
Algorithm to Find Minimal Cover
1. Decompose the FDs so that each has a single attribute on the right-hand side (RHS).
2. Minimize the left-hand side (LHS) of each FD (remove any redundant attributes).
3. Remove redundant FDs (if a FD can be derived from others, remove it).
Step-by-step Construction of Minimal Cover
Step 1: Decompose RHS (if needed)
In this case:
B → A (already a single attribute on the RHS)
D → A
AB → D
No changes needed here.

Step 2: Minimize LHS of each FD


Let’s check AB → D:
• Check if A is redundant in AB → D.
o Test: Does B → D hold?
▪ Closure of B using E = {B → A, D → A, AB → D}:
▪ From B → A, we get A
▪ So B+ = {B, A}
▪ With AB → D, and we have both A and B, so get D
▪ So B+ = {A, B, D}
Therefore, B → D holds, so A is redundant.
So we can replace AB → D with B → D.
Step 3: Remove redundant FDs
Let’s test for redundancy:
Check if B → A is redundant:
• From B → D, and D → A, then B → A is implied.
So B → A is redundant.
Remove it.
Now we have:
D→A
B→D
Check if this set can still derive all original dependencies.
Compute closure of B:
• B→D
• D→A
So B+ = {B, D, A}
Original FD set:
• B→A
• D→A
• AB → D (we now use B → D)
So we’re good.
Final Minimal Cover m:
m = { B → D, D → A }
• Functional Dependency: A rule that specifies a relationship between attributes in a
relation.
• Minimal Cover (m): A simplified but equivalent version of the original FD set with:
o Single RHS attributes,
o No redundant LHS attributes,
o No redundant FDs.
• Result for E = {B → A, D → A, AB → D}:
Minimal Cover m = { B → D, D → A }
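The closure test used in Step 2 can be automated. Below is a small Python sketch of the attribute-closure algorithm (function and variable names are illustrative):

# Compute the closure of an attribute set under a list of FDs.
# Each FD is a pair (lhs, rhs) of attribute sets.
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole LHS is derivable, the RHS is too.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

E = [({'B'}, {'A'}), ({'D'}, {'A'}), ({'A', 'B'}, {'D'})]
print(closure({'B'}, E))  # {'A', 'B', 'D'}: B -> D holds, so A is redundant in AB -> D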

5. Explain the types of update anomalies in SQL with an example.


Insert Anomaly: An insert anomaly occurs when certain attributes cannot be inserted into the database without the presence of other attributes.
Delete Anomaly: A delete anomaly exists when certain attributes are lost because of the deletion of other attributes.
Update Anomaly: An update anomaly exists when one or more instances of duplicated data are updated, but not all.
Consider a table University with seven attributes: Sid, Sname, Cid, Cname, Fid, Fname, and Salary, where Sid acts as the key attribute (primary key) of the relation. A new course cannot be added until at least one student enrolls (insert anomaly); deleting the last student of a course also removes the course and faculty details (delete anomaly); and changing a faculty member's Salary requires updating every row that repeats it, otherwise the data becomes inconsistent (update anomaly).
6. Demonstrate the Database Transaction with transaction diagram.
A transaction can be viewed as a set of operations that performs a logical unit of work. Transactions change data in the database by inserting new data, modifying existing data, or deleting existing data. The states form the following flow (in place of a diagram): Active → Partially Committed → Committed → Terminated, with failures routed through Failed → Aborted → Terminated.

Active state: This is the very first state of the transaction. While the read/write operations of the transaction are executing, the transaction is in the active state. If there is any failure, it goes to the failed state; if all operations succeed, the transaction moves to the partially committed state. All the changes carried out in this stage are stored in buffer memory.
• Partially Committed state: Once all the instructions of the transaction have executed successfully, the transaction enters the partially committed state. If the changes are made permanent from the buffer memory, the transaction enters the committed state; otherwise, on failure, it enters the failed state. This state exists because a transaction can involve a large number of changes to the database, and if a power failure or other technical problem brings the system down before those changes are flushed from the buffer, the database could be left with inconsistent changes.
Committed state: Once all the operations are successfully executed and the
transaction is out of the partially committed state, all the changes become
permanent in the database. That is the Committed state. There’s no going back! The
changes cannot be rolled back and the transaction goes to the terminated state.

Failed state: In case there is any failure in carrying out the instructions while the
transaction is in the active state, or there are any issues while saving the changes
permanently into the database (i.e. in the partially committed stage) then the
transaction enters the failed state.
• Aborted state: If any check fails and the transaction reaches the failed state, the database recovery system rolls the transaction back, undoing its changes so that the database returns to the consistent state it was in before the transaction started.
• Terminated state: After a transaction is aborted, the DBMS can recover in one of two ways: restart the transaction, or terminate it and free the system for other transactions. Once a transaction commits or is rolled back and killed, it reaches the terminated state, its final state.

7. Demonstrate working of Assertion & Triggers in SQL? Explain with an example.


Triggers and Assertions

Triggers and assertions are database objects used to enforce data integrity and automate
certain actions within a database.

Triggers: Triggers are special types of stored procedures that are automatically executed
or fired when certain events occur in a database. These events can include INSERT,
UPDATE, or DELETE operations on a table. Triggers are commonly used to enforce
business rules, audit changes, or maintain data consistency.

Example of a trigger:

CREATE TRIGGER trg_after_insert
AFTER INSERT ON employees
FOR EACH ROW
BEGIN
    INSERT INTO audit_table (action, date) VALUES ('insert', NOW());
END;

In this example, a trigger is created to automatically insert a record into an audit table
whenever a new record is inserted into the employees table.

Assertions: Assertions are conditions that are defined and enforced at the database level
to ensure that the data in the database meets certain criteria. They are typically used to
enforce business rules or constraints that cannot be expressed using primary key, foreign
key, or check constraints.

Example of an assertion:

CREATE ASSERTION salary_check
CHECK (
    (SELECT COUNT(*) FROM employees WHERE salary > 0)
    = (SELECT COUNT(*) FROM employees)
);

In this example, an assertion named "salary_check" is created to ensure that the salary of
all employees is greater than 0. The assertion uses a SELECT statement to compare the
count of employees with a salary greater than 0 to the total count of employees, ensuring
that all employees have a positive salary.

Triggers and assertions are powerful tools for maintaining data integrity and enforcing
business rules within a database.

8. Demonstrate the ACID properties of database transaction.


ACID Properties (Desirable Properties of Transactions)
In a DBMS, ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that guarantee that database transactions are processed reliably. In the context of databases, a single logical operation on the data is called a transaction.
For example, a transfer of funds from one bank account to another, even involving
multiple changes such as debiting one account and crediting another, is a single
transaction.
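As an illustration, the transfer can be issued as a single transaction (the accounts table, its columns, and the exact BEGIN syntax vary by DBMS and are assumed here):

BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 500 WHERE account_id = 'A';  -- debit
UPDATE accounts SET balance = balance + 500 WHERE account_id = 'B';  -- credit
COMMIT;
-- If anything fails before COMMIT, a ROLLBACK undoes both updates (atomicity);
-- once COMMIT succeeds, the updates survive crashes (durability).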
Atomicity: Atomicity refers to the ability of the DBMS to guarantee that either all
of the operations of a transaction are performed or none of them are. Database
modifications must follow an all or nothing rule. Each transaction is said to be atomic
if when one part of the transaction fails, the entire transaction fails. The atomicity
property requires that we execute a transaction to completion. It is the responsibility
of the transaction recovery subsystem of a DBMS to ensure atomicity. If a
transaction fails to complete for some reason, such as a system crash in the midst of
transaction execution, the recovery technique must undo any effects of the
transaction on the database.
Consistency: The consistency property ensures that the database remains in a
consistent state before the start of the transaction and after the transaction is over
(whether successful or not). The preservation of consistency is generally considered
to be the responsibility of the programmers who write the database programs or of
the DBMS module that enforces integrity constraints. A consistent state of the
database satisfies the constraints specified in the schema as well as any other
constraints that should hold on the database. A database program should be written
in a way that guarantees that, if the database is in a consistent state before executing
the transaction, it will be in a consistent state after the complete execution of the
transaction.
Isolation: The isolation portion of the ACID Properties is needed when there are
concurrent transactions. Concurrent transactions are transactions that occur at the
same time, such as shared multiple users accessing shared objects. Although
multiple transactions may execute concurrently, each transaction must be
independent of other concurrently executing transactions. A transaction should
appear as though it is being executed in isolation from other transactions. That is,
the execution of a transaction should not be interfered with by any other
transactions executing concurrently. In a database system where more than one
transaction are being executed simultaneously and in parallel, the property of
isolation states that all the transactions will be carried out and executed as if it is the
only transaction in the system. No transaction will affect the existence of any other
transaction.
Durability: Maintaining the updates of committed transactions is critical; these updates must never be lost. The ACID property of durability addresses this need. Durability refers to the ability of the system to recover committed transaction updates if either the system or the storage media fails. Features to consider for durability:
• recovery to the most recent successful commit after a database software failure
• recovery to the most recent successful commit after an application software failure
• recovery to the most recent successful commit after a CPU failure
• recovery to the most recent successful backup after a disk failure
• recovery to the most recent successful commit after a data disk failure
9. Demonstrate the System Log in database transaction.
A log (or journal) keeps track of all transaction operations that affect the values of database items. This information may be needed to permit recovery from transaction failures.
The log is kept on disk, so it is not affected by any type of failure except a disk or catastrophic failure. One or more main-memory buffers hold the last part of the log file, so log entries are first added to the main-memory buffer. When the log buffer is filled, or when certain other conditions occur, the log buffer is appended to the end of the log file on disk.
In addition, the log is periodically backed up to archival storage (tape) to guard against catastrophic failures.
The following are the types of entries called log records that are written to the log
file and the corresponding action for each log record.
In these entries, T refers to a unique transaction-id that is generated automatically
by the system for each transaction and that is used to identify each transaction:
1. [start_transaction, T]. Indicates that transaction T has started execution.
2. [write_item, T, X, old_value, new_value]. Indicates that transaction T has changed
the value of database item X from old_value to new_value.
3. [read_item, T, X]. Indicates that transaction T has read the value of database item
X.
4. [commit, T]. Indicates that transaction T has completed successfully, and affirms
that its effect can be committed (recorded permanently) to the database.
5. [abort, T]. Indicates that transaction T has been aborted.

8. Explain stored procedure language in SQL with an example.


A stored procedure in SQL is a set of SQL statements that are saved and stored in the
database. They are used to encapsulate logic for reuse, simplify complex operations,
and improve performance. Stored procedures can accept input parameters, return
output parameters, and contain control-of-flow constructs like IF, WHILE, etc.
Reusability: You can call the procedure multiple times from different parts of your
application.
Maintainability: Logic is centralized and easier to manage.
Security: Permissions can be granted to execute the procedure without giving access
to the underlying tables.
Performance: Stored procedures are precompiled, so execution is faster than
sending many individual queries.
Syntax (Generic)
CREATE PROCEDURE procedure_name (
IN parameter1 datatype,
OUT parameter2 datatype
)
BEGIN
-- SQL statements
END;
Example:
CREATE TABLE employees (
id INT,
name VARCHAR(100),
salary DECIMAL(10, 2)
);
Here's a stored procedure to increase the salary of an employee by a given percentage (MySQL syntax; the DELIMITER change lets the client accept the semicolons inside the body):
DELIMITER //
CREATE PROCEDURE IncreaseSalary(
    IN emp_id INT,
    IN percentage DECIMAL(5,2)
)
BEGIN
    UPDATE employees
    SET salary = salary + (salary * percentage / 100)
    WHERE id = emp_id;
END //
DELIMITER ;

CALL IncreaseSalary(101, 10);

10. Demonstrate the Two phase locking protocol used for concurrency control.
Locking in a database management system is used for handling concurrent transactions. The two-phase locking (2PL) protocol ensures conflict-serializable schedules; a schedule is called conflict serializable if it is conflict-equivalent to some serial schedule.
• Shared lock: Data can only be read when a shared lock is applied; it cannot be written. It is denoted lock-S.
• Exclusive lock: Data can be read as well as written when an exclusive lock is applied. It is denoted lock-X.
Growing phase: In the growing phase, the transaction only acquires locks; it cannot release any lock. The growing phase ends at the lock point, after which the shrinking phase begins.
Shrinking phase: In the shrinking phase, the transaction only releases locks; it cannot acquire any new lock.
Time | Action                   | Notes
t1   | T1 requests X-lock on A  | Growing phase starts for T1
t2   | T1 reads and writes A    | Allowed (has exclusive lock)
t3   | T2 requests S-lock on A  | Blocked (T1 holds X-lock)
t4   | T1 requests S-lock on B  | Allowed (still in growing phase)
t5   | T1 reads B               |
t6   | T1 releases lock on A    | Shrinking phase starts for T1
t7   | T1 releases lock on B    |
t8   | T2 granted S-lock on A   | Allowed now (T1 released lock)
t9   | T2 reads A               |
t10  | T2 releases lock on A    |

• Ensures conflicting operations are serialized.


• Prevents read-write or write-write conflicts.
• Guarantees serializability of transactions.
11. Explain different types of Two phase Locking System
Two-phase locking is further classified into three types:
1. Strict two-phase locking protocol:
   • The transaction can release a shared lock after its lock point.
   • The transaction cannot release any exclusive lock until it commits or aborts.
   • Because uncommitted writes remain locked, no other transaction reads dirty data, so strict 2PL avoids cascading rollbacks (schedules in which the rollback of one transaction forces dependent transactions to roll back as well).
2. Rigorous two-phase locking protocol:
   The transaction cannot release either kind of lock, neither shared nor exclusive, until it commits or aborts. Serializability is guaranteed in the rigorous two-phase locking protocol, but freedom from deadlock is not.
3. Conservative two-phase locking protocol:
   The transaction must lock all the data items it requires before the transaction begins execution.

• If any of the data items is not available for locking, then no data items are locked, and the transaction waits and retries.
• The read and write sets of data items need to be known before the transaction begins, which is normally not possible.
• The conservative two-phase locking protocol is deadlock-free.

12. Demonstrate the Concurrency control based on Timestamp ordering


Timestamp ordering is a concurrency control protocol that uses timestamps to order transactions and ensure they execute in a serializable manner. Transactions are ordered by the ascending order of their creation times, and the protocol also maintains the timestamps of the last read and write operations on each data item.
Each transaction is issued a timestamp when it enters the system: if an old transaction Ti has timestamp TS(Ti), a new transaction Tk is assigned a timestamp TS(Tk) such that TS(Ti) < TS(Tk). The protocol manages concurrent execution such that the timestamps determine the serializability order.

Each transaction is assigned a timestamp when it starts:


• TS(T) = timestamp of transaction T
Each data item X has:
• read_TS(X) = the largest timestamp of any transaction that successfully read X
• write_TS(X) = the largest timestamp of any transaction that successfully wrote X

The timestamp ordering protocol ensures that any conflicting read and write operations are executed in timestamp order.
Suppose a transaction Ti issues read(Q):
1. If TS(Ti) < W-timestamp(Q), then Ti needs to read a value of Q that was already overwritten. Hence, the read operation is rejected, and Ti is rolled back.
2. If TS(Ti) ≥ W-timestamp(Q), then the read operation is executed, and R-timestamp(Q) is set to the maximum of R-timestamp(Q) and TS(Ti).
Suppose that transaction Ti issues write(Q):
1. If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is producing was needed previously, and the system assumed that that value would never be produced. Hence, the write operation is rejected, and Ti is rolled back.
2. If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete value of Q. Hence, this write operation is rejected, and Ti is rolled back.
3. Otherwise, the write operation is executed, and W-timestamp(Q) is set to TS(Ti).
Example
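A minimal Python sketch of these rules (the data item, its timestamps, and the rollback signal are illustrative):

class DataItem:
    def __init__(self):
        self.read_ts = 0    # largest TS of any transaction that read the item
        self.write_ts = 0   # largest TS of any transaction that wrote the item

def read_item(item, ts):
    if ts < item.write_ts:
        return "rollback"                        # value already overwritten
    item.read_ts = max(item.read_ts, ts)
    return "ok"

def write_item(item, ts):
    if ts < item.read_ts or ts < item.write_ts:
        return "rollback"                        # write arrives too late
    item.write_ts = ts
    return "ok"

q = DataItem()
print(read_item(q, 5))   # ok: R-timestamp(Q) becomes 5
print(write_item(q, 3))  # rollback: TS(Ti) = 3 < R-timestamp(Q) = 5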

13. Why Concurrency control is needed? Demonstrate with an example.


If transactions are executed serially, i.e., sequentially with no overlap in time, no
transaction concurrency exists. However, if concurrent transactions with
interleaving operations are allowed in an uncontrolled manner, some unexpected,
undesirable result may occur, such as:
The lost update problem: A second transaction writes a second value of a data-item
(datum) on top of a first value written by a first concurrent transaction, and the
first value is lost to other transactions running concurrently which need, by their
precedence, to read the first value. The transactions that have read the wrong
value end with incorrect results.
The dirty read problem: Transactions read a value written by a transaction that has
been later aborted. This value disappears from the database upon abort, and
should not have been read by any transaction (“dirty read”). The reading
transactions end with incorrect results.
The incorrect summary problem: While one transaction takes a summary over the
values of all the instances of a repeated data-item, a second transaction updates
some instances of that data-item. The resulting summary does not reflect a correct
result for any (usually needed for correctness) precedence order between the two
transactions (if one is executed before the other), but rather some random result,
depending on the timing of the updates, and whether certain update results have
been included in the summary or not.
Most high-performance transactional systems need to run transactions
concurrently to meet their performance requirements. Thus, without concurrency
control such systems can neither provide correct results nor maintain their
databases consistently.
Example:
Suppose two bank clerks are accessing the same customer account at the same
time to update the balance.
• Initial Balance: ₹10,000
Transaction T1 (Clerk A):
Reads balance → ₹10,000
Deposits ₹5,000 → Balance becomes ₹15,000
Writes new balance
Transaction T2 (Clerk B):
Reads balance → ₹10,000
Withdraws ₹3,000 → Balance becomes ₹7,000
Writes new balance
Without concurrency control:
Final balance in DB = ₹7,000, but it should be ₹12,000.
→ T1's update was lost because T2 overwrote the balance.
If concurrency control (like locking or timestamp ordering) is applied:
• T1 locks the record while updating.
• T2 must wait until T1 finishes.
• This ensures updates happen one after another, preserving correctness.
Final correct balance = ₹12,000

14. What is NOSQL? Explain the CAP theorem.


Ans: NoSQL stands for "Not Only SQL." It refers to a class of database management
systems that are designed to handle large volumes of unstructured, semi-
structured, or structured data and are optimized for performance, scalability, and
flexibility. Unlike traditional relational databases (RDBMS) that use tables, rows,
and columns, NoSQL databases use various data models:
Types of NoSQL Databases:
1. Document Stores (e.g., MongoDB, CouchDB) – Store data in JSON-like documents.
2. Key-Value Stores (e.g., Redis, DynamoDB) – Store data as key-value pairs.
3. Column-Family Stores (e.g., Cassandra, HBase) – Use column-oriented storage.
4. Graph Databases (e.g., Neo4j) – Store data as nodes and relationships.

CAP THEOREM

The CAP Theorem is an important concept in distributed database systems that helps
architects and designers understand the trade offs while designing a system.
It states that a system can only guarantee two of three properties: Consistency,
Availability, and Partition Tolerance. This means no system can do it all, so
designers must make smart choices based on their needs.

Consistency

Consistency means that all clients see the same data simultaneously, no matter which node they connect to in a distributed system. For eventual consistency the guarantee is looser: clients will eventually see the same data on all nodes at some point in the future.
• All nodes in the system see the same data at the same time, because the nodes are constantly communicating with each other and sharing updates.
• Any changes made to the data on one node are immediately propagated to all other nodes, ensuring that everyone has the same up-to-date information.
Availability
Availability means that all non-failing nodes in a distributed system return a response for all read and write requests in a bounded amount of time, even if one or more other nodes are down.
• Users can send requests, and the system remains available and functioning even when we cannot see specific network components.
• Every request receives a response, whether successful or not. This is a crucial aspect of availability, as it guarantees that users always get feedback.

Partition Tolerance
Partition tolerance means that the system continues to operate despite arbitrary message loss or failure in parts of the system. Distributed systems guaranteeing partition tolerance can gracefully recover once the partition heals.
• It addresses network failures, a common cause of partitions: the system is designed to function even when parts of the network become unreachable.
• The system can adapt to arbitrary partitioning, meaning it can handle unpredictable network failures without complete failure.

CAP TRADEOFF

1. CA System
A CA system delivers consistency and availability across all the nodes. It cannot do this if there is a partition between any two nodes in the system, and therefore it does not support partition tolerance.
2. CP System
A CP system delivers consistency and partition tolerance at the expense of availability. When a partition occurs between two nodes, the system shuts down the non-available node until the partition is resolved. Examples of such databases are MongoDB, Redis, and HBase.
3. AP System
An AP system delivers availability and partition tolerance at the expense of consistency. When a partition occurs, all nodes remain available, but those at the wrong end of a partition might return an older version of the data than others. Examples: CouchDB, Cassandra, DynamoDB, etc.
15. What are document based NOSQL systems? Explain basic operations CRUD in
MongoDB
Document-Based NoSQL Systems are a type of NoSQL database that store data in
document-like structures, typically using formats like JSON, BSON (Binary JSON), or
XML. These databases are designed to handle unstructured, semi-structured, or
structured data and provide flexibility, scalability, and high performance, especially
for web and big data applications.
1. Schema-less: Documents can have different fields, allowing dynamic changes to
data structure.
2. Document-Oriented: Each record is stored as a document (e.g., JSON).
3. Nested Data: Documents can contain nested structures such as arrays and sub-
documents.
4. Indexing Support: Most systems offer indexing for fast query performance.
5. Horizontal Scalability: Easy to scale across servers.
Popular Document-Based NoSQL Databases:
• MongoDB (most widely used)
• CouchDB
• Amazon DocumentDB
• RethinkDB

The basic methods of interacting with a MongoDB server are called CRUD
operations. CRUD stands for Create, Read, Update, and Delete. These CRUD
methods are the primary ways you will manage the data in your databases.
CRUD operations describe the conventions of a user interface that let users view,
search, and modify parts of the database.
• The Create operation is used to insert new documents in the MongoDB database.
• The Read operation is used to query a document in the database.
• The Update operation is used to modify existing documents in the database.
• The Delete operation is used to remove documents in the database.
Create Operations
For MongoDB CRUD, if the specified collection doesn’t exist, the create operation
will create the collection when it’s executed. Create operations in MongoDB target
a single collection, not multiple collections. Insert operations in MongoDB
are atomic on a single document level.
MongoDB provides two different create operations that you can use to insert
documents into a collection:
• db.collection.insertOne()
• db.collection.insertMany()

Read Operations

Read operations are used to retrieve documents from the collection, or in other words, to query a collection for documents. We can perform read operations using the following methods provided by MongoDB:

Method                  | Description
db.collection.find()    | Retrieves documents from the collection.
db.collection.findOne() | Retrieves a single document that matches the query criteria.

Update Operations
Update operations are used to modify existing documents in the collection. We can update a single document or multiple documents that match a given query, using the following methods provided by MongoDB:

Method                     | Description
db.collection.updateOne()  | Updates a single document in the collection that satisfies the given criteria.
db.collection.updateMany() | Updates multiple documents in the collection that satisfy the given criteria.
db.collection.replaceOne() | Replaces a single document in the collection that satisfies the given criteria.

Delete Operations
Delete operations are used to remove documents from a collection, either based on specific criteria or all documents. We can perform delete operations using the following methods provided by MongoDB:

Method                     | Description
db.collection.deleteOne()  | Deletes a single document from the collection that satisfies the given criteria.
db.collection.deleteMany() | Deletes multiple documents from the collection that satisfy the given criteria.
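Put together, a minimal mongosh session covering all four CRUD operations might look like this (the students collection and its fields are illustrative):

db.students.insertOne({ name: "John", dept: "CSE" })                 // Create
db.students.find({ dept: "CSE" })                                    // Read
db.students.updateOne({ name: "John" }, { $set: { dept: "ISE" } })   // Update
db.students.deleteOne({ name: "John" })                              // Delete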

16. What is NOSQL Graph database? Explain Neo4j.


A NoSQL graph database is a type of database that stores and manages data as a
network of interconnected nodes and relationships, unlike traditional relational
databases that use tables and rows. Neo4j is a prominent example of a NoSQL
graph database, offering a native graph processing engine for efficient traversal and
analysis of interconnected data.
Neo4j is a graph database, a type of NoSQL database that stores data as nodes,
relationships, and properties, rather than in tables or documents. Unlike traditional
relational databases, Neo4j excels in handling and querying highly interconnected
data. It is a popular choice for applications where relationships between data
points are crucial, such as social networks, fraud detection, and knowledge graphs.
• Nodes: Represent entities or objects in the graph.
• Relationships: Define connections between nodes, establishing how they are
related.
• Properties: Attributes associated with nodes or relationships, providing additional
information.
• Cypher: Neo4j's native query language for interacting with the graph database.
• Native Graph Storage: Neo4j's architecture is designed to handle graph data
natively, unlike some other databases that might add graph capabilities on top of
an existing relational model.
• Flexibility and Scalability: Neo4j's graph model is flexible and can adapt to changing
data structures, and its architecture is designed to scale well.

Features of Neo4j Data Model:


• Labels and properties: A node label can be declared/specified when a node is
created.
o It is also possible to create nodes without any labels.
• Relationships and relationship types: "->" specifies the direction of the
relationship.
o The relationship can be traversed in either direction.
• Paths: A path specifies a traversal of part of the graph. Typically it is used as part of
a query to specify a pattern, where the query will retrieve from the graph data that
matches the pattern.
o A path is typically specified by a start node, followed by one or more
relationships, leading to one or more end nodes that satisfy the pattern.
• Optional schema: Graphs can be created and used without a schema; defining one is optional.
o The main schema-related features involve creating indexes and constraints based on the labels and properties.
Indexing and node identifiers: The Neo4j system creates an internal unique system-
defined identifier for each node, when a node is created. To retrieve individual
nodes using other properties of the nodes efficiently, the user can create indexes
for the collection of nodes that have a particular label.
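A small Cypher sketch tying these concepts together (the labels, relationship type, and property names are illustrative):

// Two nodes with a label, properties, and a directed relationship
CREATE (a:Person {name: 'Alice'})-[:FRIEND_OF]->(b:Person {name: 'Bob'});

// A path pattern: retrieve the names of Alice's friends
MATCH (p:Person {name: 'Alice'})-[:FRIEND_OF]->(f:Person)
RETURN f.name;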

17. Illustrate insert, delete, update, alter & drop commands in SQL.
INSERT: This operation allows you to add new records or rows to a table.
UPDATE: The UPDATE operation enables you to modify existing records in a table.
DELETE: The DELETE operation allows you to remove records from a table.
The INSERT Statement
The INSERT statement is used to add new data into a table. It allows you to specify
the columns to which you want to insert data, as well as the values for each
column. The basic syntax for the INSERT statement is as follows:
INSERT INTO table_name (column1, column2, column3, ...)
VALUES (value1, value2, value3, ...);
The UPDATE Statement
The UPDATE statement is used to modify existing records in a table. It allows you to
change the values of specific columns based on certain conditions. The basic syntax
for the UPDATE statement is as follows:
UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;
The DELETE Statement
The DELETE statement is used to remove records from a table based on certain
conditions. It allows you to specify which rows you want to delete. The basic syntax
for the DELETE statement is as follows:
DELETE FROM table_name
WHERE condition;
CREATE TABLE
The CREATE TABLE command creates a new table in the database.
The following SQL creates a table called "Persons" that contains five columns:
PersonID, LastName, FirstName, Address, and City:
Example
CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255)
);
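Using the Persons table above, the three DML statements look like this (the values are sample data):

INSERT INTO Persons (PersonID, LastName, FirstName, Address, City)
VALUES (1, 'Hansen', 'Ola', 'Timoteivn 10', 'Sandnes');

UPDATE Persons
SET City = 'Oslo'
WHERE PersonID = 1;

DELETE FROM Persons
WHERE PersonID = 1;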

CREATE TABLE Using Another Table


A copy of an existing table can also be created using CREATE TABLE.
The following SQL creates a new table called "TestTable" (which is a copy of the "Customers" table):
Example
CREATE TABLE TestTable AS
SELECT customername, contactname
FROM customers;
ALTER TABLE
The ALTER TABLE command adds, deletes, or modifies columns in a table.
The ALTER TABLE command also adds and deletes various constraints in a table.
The following SQL adds an "Email" column to the "Customers" table:
Example
ALTER TABLE Customers
ADD Email varchar(255);
The following SQL deletes the "Email" column from the "Customers" table:
Example
ALTER TABLE Customers
DROP COLUMN Email;

DROP TABLE
The DROP TABLE command deletes a table in the database.
The following SQL deletes the table "Shippers":
Example
DROP TABLE Shippers;
Note: Be careful before deleting a table. Deleting a table results in loss of all
information stored in the table!

18. Explain informal design guidelines for relational schema design.


1. Semantics of the attributes
2. Reducing the redundant values in tuples
3. Reducing NULL values in tuples
4. Disallowing spurious tuples

1. Semantics of the Attributes: Whenever we form a relational schema, there should be some meaning among the attributes. This meaning, called semantics, relates one attribute to another.
2. Reducing the Redundant Values in Tuples: Mixing attributes of multiple entities may cause problems; information is stored redundantly, wasting storage, and leads to update anomalies:
• Insertion anomalies: whenever we insert tuples, there may be N students in one department, so the DeptNo and DeptName values are repeated N times, which leads to data redundancy; moreover, we cannot insert a new department that has no students yet.
• Deletion anomalies: if we delete the last student of a department, the whole information about that department is deleted.
• Modification anomalies: if we change the value of one of a department's attributes, we must update the tuples of all students belonging to that department, or the database will become inconsistent.
3. Reducing NULL Values in Tuples: Relations should be designed such that their tuples have as few NULL values as possible. Attributes that are frequently NULL can be placed in separate relations (with the primary key). Reasons for nulls: the attribute is not applicable or invalid; the attribute value is unknown (may exist); or the value is known to exist but is unavailable.
4. Disallowing Spurious Tuples: Bad designs for a relational database may result in erroneous results for certain JOIN operations. The "lossless join" property is used to guarantee meaningful results for join operations. Relations should be designed to satisfy the lossless join condition, so that no spurious tuples are generated by a natural join of any relations.

19. What is Functional dependency? Explain the inference rules for functional
dependency with proof.
In relational database theory, a functional dependency (FD) is a constraint between
two sets of attributes in a relation from a database. A functional dependency is
denoted as:
X→Y
Where:
• X and Y are subsets of attributes of a relation R.
• It means that if two tuples (rows) have the same values for attributes in X, then they
must have the same values for attributes in Y.

In other words, X functionally determines Y

Example:

Consider a relation Student(ID, Name, Department).


Here, ID → Name means if two students have the same ID, they must have the same Name.

There are six inference rules (Armstrong's axioms and rules derived from them), defined below:

• Reflexive rule: If B is a subset of A, then A logically determines B. Formally: if B ⊆ A, then A → B.
  Example: the Address (A) of a house contains parameters like house number, street number, and city; these are all subsets of A, so Address → House no.
• Augmentation rule: If A logically determines B, then adding any extra attribute to both sides does not change the dependency.
  Example: if A → B, then adding an attribute C gives AC → BC.
• Transitive rule: If A determines B and B determines C, then A indirectly determines C.
  Example: if A → B and B → C, then A → C.
• Union rule: If A determines B and A determines C, then A determines BC.
  Example: if A → B and A → C, then A → BC.
• Decomposition rule: The reverse of the union rule. If A determines BC, then A determines B and A determines C.
  Example: if A → BC, then A → B and A → C.
• Pseudo-transitive rule: If A determines B, and BC determines D, then AC determines D.
  Example: if A → B and BC → D, then AC → D.
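The question asks for proof; the derived rules follow from the first three rules (Armstrong's axioms). For example, a short derivation of the Union rule:

Given: A → B and A → C. To show: A → BC.
1. A → B          (given)
2. A → C          (given)
3. A → AB         (augment 1 with A; AA = A)
4. AB → BC        (augment 2 with B; CB = BC)
5. A → BC         (transitivity on 3 and 4)

The Decomposition and Pseudo-transitive rules can be derived in the same way from reflexivity, augmentation, and transitivity.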
20. Demonstrate transaction states & additional operations.
A database transaction progresses through several states and involves operations
for managing its execution. Here's a breakdown of transaction states and common
operations:
Transaction States:
• Active:
The transaction is currently executing and modifying the database.
• Partially Committed:
The transaction has executed its final operation, but its changes are not yet permanent in the database.
• Failed:
The transaction has encountered an error or a check has failed, preventing it from
continuing.
• Aborted:
If a transaction fails, it's rolled back, and any changes are undone to restore the
database to its original state.
• Committed:
The transaction has successfully completed all operations, and its changes are
permanently saved in the database.
Additional Operations:
• BEGIN TRANSACTION: Initiates a new transaction, marking its start.
• READ/WRITE: Database operations that access or modify data.
• END TRANSACTION: Signals the completion of a transaction's operations.
• COMMIT: Persistently saves the changes made by the transaction.
• ROLLBACK: Undoes all changes made by the transaction, restoring the database to
its previous state.
• Undo/Redo: Specific operations used in recovery to undo failed operations or re-
execute committed operations

21. Define Schedule? Illustrate with an example.


In a DBMS, a schedule is an ordering of the operations (reads, writes, commits, aborts) of a set of concurrent transactions in which the order of the operations within each individual transaction is preserved. For example, for transactions T1 and T2, S: r1(X); w1(X); r2(X); w2(X) is a schedule in which both of T1's operations on X execute before T2's.
22. Explain types and Characterizing Schedules based on Serializability
schedules that are always considered to be correct when concurrent transactions
are executing are known as serializable schedules
Suppose that two users for example, two airline reservations agents submit to the
DBMS transactions T1 and T2 at approximately the same time. If no interleaving of
operations is permitted, there are only two possible outcomes:
1. Execute all the operations of transaction T1 (in sequence) followed by all the
operations of transaction T2 (in sequence).
2. Execute all the operations of transaction T2 (in sequence) followed by all the
operations of transaction T1 (in sequence).

Serial schedule:
A schedule S is serial if, for every transaction T participating in the schedule, all
the operations of T are executed consecutively in the schedule.
Otherwise, the schedule is called nonserial schedule.

Serializable schedule:
A schedule S is serializable if it is equivalent to some serial schedule of the same
n transactions.
Result equivalent:
Two schedules are called result equivalent if they produce the same final state of
the database.
Conflict equivalent:
Two schedules are said to be conflict equivalent if the order of any two conflicting
operations is the same in both schedules.
Conflict serializable:
A schedule S is said to be conflict serializable if it is conflict equivalent to some serial schedule.
Being serializable is not the same as being serial. Being serializable implies that the schedule is a correct schedule: it will leave the database in a consistent state. The interleaving is appropriate and will result in a state as if the transactions were serially executed, yet it achieves efficiency due to concurrent execution.

23. Testing conflict serializability of a Schedule S


For each transaction Ti participating in schedule S, create a node labeled Ti in the precedence graph.
For each case in S where Tj executes a read_item(X) after Ti executes a write_item(X), create an edge (Ti → Tj) in the precedence graph.
For each case in S where Tj executes a write_item(X) after Ti executes a read_item(X), create an edge (Ti → Tj) in the precedence graph.
For each case in S where Tj executes a write_item(X) after Ti executes a write_item(X), create an edge (Ti → Tj) in the precedence graph.
The schedule S is serializable if and only if the precedence graph has no cycles.

[Figures: precedence graphs constructed for sample schedules to test conflict serializability. (a) Precedence graph for serial schedule A. (b) Precedence graph for serial schedule B. (c) Precedence graph for schedule C (not serializable). (d) Precedence graph for schedule D (serializable, equivalent to schedule A). A further example applies the test to the READ and WRITE operations of three transactions T1, T2, and T3.]
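A small Python sketch of this test, assuming the schedule is given as (transaction, operation, item) triples in execution order:

from collections import defaultdict

# Build the precedence graph: Ti -> Tj for each pair of conflicting operations
# (different transactions, same item, at least one write) where Ti comes first.
def precedence_graph(schedule):
    edges = set()
    for i, (ti, op_i, x) in enumerate(schedule):
        for tj, op_j, y in schedule[i + 1:]:
            if ti != tj and x == y and 'w' in (op_i, op_j):
                edges.add((ti, tj))
    return edges

# Depth-first search for a cycle in the precedence graph.
def has_cycle(edges):
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = defaultdict(int)  # every node starts WHITE
    def dfs(u):
        color[u] = GRAY
        for v in adj[u]:
            if color[v] == GRAY or (color[v] == WHITE and dfs(v)):
                return True
        color[u] = BLACK
        return False
    return any(color[u] == WHITE and dfs(u) for u in list(adj))

s = [('T1', 'r', 'X'), ('T2', 'w', 'X'), ('T1', 'w', 'X')]
e = precedence_graph(s)
print(e)  # {('T1', 'T2'), ('T2', 'T1')}: a cycle
print("not serializable" if has_cycle(e) else "serializable")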
24. What are the views in SQL? Explain with example
A view in SQL is a saved SQL query that acts as a virtual table. Unlike regular tables, views do not store data themselves. Instead, they dynamically generate data by executing the SQL query defined in the view each time it is accessed. A view can fetch data from one or more tables and present it in a customized format, as in the example below.
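For instance, a view restricted to one department (the Student table and its columns are assumed for illustration):

CREATE VIEW cse_students AS
SELECT Sid, Sname
FROM Student
WHERE Dept = 'CSE';

-- The view is then queried like an ordinary table:
SELECT * FROM cse_students;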
25. In SQL, write the usage of GROUP BY and HAVING clauses with suitable examples
Ans: GROUP BY is a SQL command commonly used to aggregate the data to get
insights from it. There are three phases when you group data:
• Split: the dataset is split up into chunks of rows based on the values of the
variables we have chosen for the aggregation
• Apply: Compute an aggregate function, like average, minimum and maximum,
returning a single value
• Combine: All these resulting outputs are combined in a unique table. In this way,
we’ll have a single value for each modality of the variable of interest
• GROUP BY Clause
• Purpose: Used to group rows that have the same values in specified columns into
summary rows (e.g., total sales by region).
• Commonly Used With: Aggregate functions like COUNT(), SUM(), AVG(), MAX(),
MIN().
• HAVING Clause
• Purpose: Used to filter groups created by GROUP BY. It’s similar to WHERE, but
WHERE filters rows before grouping, while HAVING filters after grouping.

GROUP BY Usage

Scenario: Find the total sales for each product.

SELECT product_id, SUM(sales_amount) AS total_sales


FROM sales
GROUP BY product_id;

• Groups records by product_id.


• Sums the sales_amount for each product.

GROUP BY with HAVING

Scenario: Find products whose total sales are greater than 1000.

SELECT product_id, SUM(sales_amount) AS total_sales

FROM sales

GROUP BY product_id

HAVING SUM(sales_amount) > 1000;

➤ Explanation:

• GROUP BY groups rows by product_id.


• SUM(sales_amount) calculates total sales for each product.
• HAVING filters out groups where total sales are not greater than 1000.

26. Discuss the types of problems that may encounter with transactions
that run concurrently

Temporary Update Problem:


Temporary update or dirty read problem occurs when one transaction updates an
item and fails. But the updated item is used by another transaction before the
item is changed or reverted back to its last value.
Example: T1 adds ₹100 to item X and then fails; T2 reads the updated X and uses it before T1's change is rolled back, so T2 has worked with a dirty value.
Incorrect Summary Problem:


Consider a situation, where one transaction is applying the aggregate function on
some records while another transaction is updating these records. The aggregate
function may calculate some values before the values have been updated and
others after they are updated.
Example: while T1 sums the balances of all accounts, T2 transfers ₹100 from account A to account B; the sum may include A before the debit but B after the credit, producing an incorrect total.
Lost Update Problem:


In the lost update problem, an update done to a data item by a transaction is lost
as it is overwritten by the update done by another transaction.
Unrepeatable Read Problem:
The unrepeatable read problem occurs when two or more read operations of the same transaction read different values of the same variable.

Phantom Read Problem:


The phantom read problem occurs when a transaction reads a variable once but
when it tries to read that same variable again, an error occurs saying that the
variable does not exist.
27. Describe the wait-die and wound-wait protocols for deadlock prevention.
Wait-Die Scheme
In this scheme, if a transaction requests to lock a resource (data item), which is
already held with a conflicting lock by another transaction, then one of the two
possibilities may occur −
• If TS(Ti) < TS(Tj), that is, Ti, which is requesting a conflicting lock, is older than Tj, then Ti is allowed to wait until the data item is available.
• If TS(Ti) > TS(Tj), that is, Ti is younger than Tj, then Ti dies. Ti is restarted later with a random delay but with the same timestamp.
This scheme allows the older transaction to wait but kills the younger one.
Wound-Wait Scheme
In this scheme, if a transaction requests a lock on a resource (data item) that is already held with a conflicting lock by another transaction, one of two possibilities may occur:
• If TS(Ti) < TS(Tj), then Ti forces Tj to be rolled back, that is, Ti wounds Tj. Tj is restarted later with a random delay but with the same timestamp.
• If TS(Ti) > TS(Tj), then Ti is forced to wait until the resource is available.
This scheme, allows the younger transaction to wait; but when an older
transaction requests an item held by a younger one, the older transaction forces
the younger one to abort and release the item.
In both the cases, the transaction that enters the system at a later stage is aborted.
Deadlock Avoidance
Aborting a transaction is not always a practical approach. Instead, deadlock
avoidance mechanisms can be used to detect any deadlock situation in advance.
Methods like "wait-for graph" are available but they are suitable for only those
systems where transactions are lightweight having fewer instances of resource. In
a bulky system, deadlock prevention techniques may work well.
Wait-for Graph
This is a simple method available to track if any deadlock situation may arise. For
each transaction entering into the system, a node is created. When a transaction
Ti requests for a lock on an item, say X, which is held by some other transaction T j,
a directed edge is created from Ti to Tj. If Tj releases item X, the edge between
them is dropped and Ti locks the data item.
The system maintains this wait-for graph for every transaction waiting for some
data items held by others. The system keeps checking if there's any cycle in the
graph.
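For example, if T1 waits for an item held by T2, T2 waits for an item held by T3, and T3 waits for an item held by T1, the graph contains the cycle T1 → T2 → T3 → T1, which indicates a deadlock among the three transactions.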

Here, we can use any of the two following approaches −


• First, do not allow any request for an item which is already locked by another
transaction. This is not always feasible and may cause starvation, where a
transaction waits indefinitely for a data item and can never acquire it.
• The second option is to roll back one of the transactions. It is not always feasible to
roll back the younger transaction, as it may be more important than the older one. With
the help of a selection algorithm, a transaction is chosen to be aborted. This
transaction is known as the victim and the process is known as victim selection.
28. Explain the features of deadlock and the difference between Wait-Die and Wound-Wait
Features of Deadlock in a DBMS

1. Mutual Exclusion: Each resource can be held by only one transaction at a time, and
other transactions must wait for it to be released.

2. Hold and Wait: Transactions can request resources while holding on to resources
already allocated to them.

3. No Preemption: Resources cannot be taken away from a transaction forcibly, and the
transaction must release them voluntarily.

4. Circular Wait: Transactions are waiting for resources in a circular chain, where each
transaction is waiting for a resource held by the next transaction in the chain.

5. Indefinite Blocking: Transactions are blocked indefinitely, waiting for resources to
become available, and no transaction can proceed.

6. System Stagnation: Deadlock leads to system stagnation, where no transaction can
proceed and the system is unable to make any progress.

7. Inconsistent Data: Deadlock can lead to inconsistent data if transactions are unable
to complete and leave the database in an intermediate state.

8. Difficult to Detect and Resolve: Deadlock can be difficult to detect and resolve, as it
may involve multiple transactions, resources, and dependencies.
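
The difference between the two schemes of Q.27 can be summarized as follows (both are timestamp-based, and in both only the younger transaction is ever aborted):

• Wait-Die: an older transaction requesting an item held by a younger one waits; a younger requester dies (is aborted and restarted later with its original timestamp).
• Wound-Wait: an older transaction requesting an item held by a younger one wounds it, forcing the younger transaction to roll back; a younger requester waits.

Because an aborted transaction restarts with its original timestamp, it eventually becomes the oldest transaction in the system, so neither scheme allows starvation.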

29. What is Multiple Granularity Locking? How is it implemented using intention
locks? Explain.

Multiple granularity locking is a locking mechanism that provides different levels
of locks for different database objects, allowing locks to be taken at different
levels of granularity. This mechanism allows multiple transactions to lock at
different levels of granularity, ensuring that conflicts are minimized and
concurrency is maximized.
To illustrate, let's consider a tree structure that has four levels of nodes.
The top level represents the entire database, and below it are nodes of
type "area", which represent specific areas of the database. Each area
has child nodes called "files", and each file represents a specific subset
of data within that area. Importantly, no file can span more than one
area.
Finally, each file has child nodes called "records", which represent
individual units of data within the file. Like files, each record is a child
node of its corresponding file and cannot be present in more than one
file. Therefore, the tree can be divided into the following levels, starting
from the top: database, area, file, and record.
Multiple granularity locking uses two types of locks:
Shared Lock
It allows multiple transactions to read the same data simultaneously. It is used to
prevent other transactions from modifying the data while a transaction is reading it.
Exclusive Lock
It prevents any other transaction from accessing the data. It is used to prevent other
transactions from reading or modifying the data while a transaction is writing to it.
Different Types of Intention Mode Locks in Multiple Granularity
Intention mode locks are a type of lock used in multiple granularity locking that
allows multiple transactions to acquire locks on the same resource, but with
different levels of access.
There are three types of intention mode locks in multiple granularity locking:
Intent Shared (IS) Locks
This lock is used when a transaction needs to read a resource but does not intend to
modify it. It indicates that the transaction wants to acquire a Shared lock on a
resource.
Intent Exclusive (IX) Locks
This lock is used when a transaction needs to modify a resource but does not intend
to share it. It indicates that the transaction wants to acquire an Exclusive lock on a
resource.
Shared with Intent Exclusive (SIX) Locks
This lock is used when a transaction intends to acquire both Shared and Exclusive
locks on a resource. It indicates that the transaction wants to acquire an Exclusive
lock on a resource after acquiring Shared locks on other resources.
These intention mode locks are used to optimize the locking mechanism in a
database by allowing transactions to acquire locks on multiple resources in a
coordinated manner. They help prevent deadlocks and improve concurrency in a
database system.
The compatibility matrix for these lock modes is described below:
Compatibility Matrix

         IS    IX    S     SIX   X
IS       YES   YES   YES   YES   NO
IX       YES   YES   NO    NO    NO
S        YES   NO    YES   NO    NO
SIX      YES   NO    NO    NO    NO
X        NO    NO    NO    NO    NO

(Rows indicate the mode in which a node is already locked; columns indicate the mode being requested; YES means the two modes are compatible.)

Intention lock modes are utilized in the multiple-granularity locking protocol to
ensure serializability. According to this protocol, when a transaction T attempts to
lock a node, it must adhere to the following guidelines:
• Transaction T must follow the lock-compatibility matrix.
• Transaction T must initially lock the root of the tree in any mode.
• Transaction T may only lock a node in S or IS mode if it has already locked the node's
parent in either IS or IX mode.
• Transaction T may only lock a node in IX, SIX, or X mode if it has already locked the
parent of the node in either SIX or IX modes.
• Transaction T may only lock a node if it has not yet unlocked any nodes (i.e., it is two-
phase).
• Transaction T may only unlock a node if it is not currently holding any locks on its
child nodes.
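
A short worked example under these rules: to update a single record r in file f of area a, a transaction locks the database (root) node in IX mode, then area a in IX mode, then file f in IX mode, and finally record r in X mode. A concurrent transaction that only reads other records of f can still lock the database, area, and file in IS mode and the records it needs in S mode, because IS and IX are compatible in the matrix above.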

30. Explain joined tables in SQL and OUTER JOIN


An SQL join clause combines records from two or more tables in a database. It
creates a set that
can be saved as a table or used as is. A JOIN is a means for combining fields from
two tables by
using values common to each. SQL specifies four types of JOIN:
1. INNER JOIN
2. OUTER JOIN
3. EQUIJOIN
4. NATURAL JOIN

INNER JOIN
An inner join is the most common join operation used in applications and can be
regarded as the
default join-type. Inner join creates a new result table by combining column values
of two tables (A
and B) based upon the join-predicate (the condition). The result of the join can be
defined as the
outcome of first taking the Cartesian product (or Cross join) of all records in the
tables (combining
every record in table A with every record in table B) then return all records which
satisfy the join
predicate
Example: SELECT * FROM employee
INNER JOIN department ON
employee.dno = department.dnumber;

EQUIJOIN and NATURAL JOIN


An EQUIJOIN is a specific type of comparator-based join that uses only equality
comparisons in the
join-predicate. Using other comparison operators (such as <) disqualifies a join as an
equijoin.
NATURAL JOIN is a type of EQUIJOIN where the join predicate arises implicitly by
comparing all
columns in both tables that have the same column-names in the joined tables. The
resulting joined
table contains only one column for each pair of equally named columns.


If the names of the join attributes are not the same in the base relations, it is
possible to rename the
attributes so that they match, and then to apply NATURAL JOIN. In this case, the AS
construct can
be used to rename a relation and all its attributes in the FROM clause.
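
A sketch of this renaming (attribute names here are illustrative, following the EMPLOYEE/DEPARTMENT schema used in this document's other examples):

SELECT Fname, Lname, Address
FROM (EMPLOYEE NATURAL JOIN
     (DEPARTMENT AS DEPT (Dname, Dno, Mssn, Msdate)))
WHERE Dname = 'Research';

DEPARTMENT is renamed to DEPT and its attributes are renamed so that its join attribute becomes Dno, matching EMPLOYEE's Dno; the NATURAL JOIN then joins implicitly on Dno.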

CROSS JOIN returns the Cartesian product of rows from tables in the join. In other
words, it will
produce rows which combine each row from the first table with each row from the
second table.
OUTER JOIN
An outer join does not require each record in the two joined tables to have a
matching record. The joined table retains each record, even if no matching record
exists in the other table. Outer joins subdivide further into:
Left outer joins
Right outer joins
Full outer joins
No implicit join-notation for outer joins exists in standard SQL.
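For example, a LEFT OUTER JOIN keeps every row of the left table; a sketch using a self-join on the EMPLOYEE table (with the Super_ssn/Ssn attributes used in earlier examples):

SELECT E.Lname AS Employee_name, S.Lname AS Supervisor_name
FROM (EMPLOYEE AS E LEFT OUTER JOIN EMPLOYEE AS S
     ON E.Super_ssn = S.Ssn);

Employees who have no supervisor still appear in the result, with NULL as their Supervisor_name.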
MULTIWAY JOIN
It is also possible to nest join specifications; that is, one of the tables in a join may
itself be a joined
table. This allows the specification of the join of three or more tables as a single
joined table, which
is called a multiway join. Example:
For each project, retrieve the project number, the controlling department number, and the department manager's last name, address, and birth date:
SELECT Pnumber, Dnum, Lname, Address, Bdate
FROM ((PROJECT JOIN DEPARTMENT ON Dnum=Dnumber)
JOIN EMPLOYEE ON Mgr_ssn=Ssn)

31. Explain Aggregate Functions in SQL


Aggregate Functions in SQL
Aggregate functions are used to summarize information from multiple tuples into a
single-tuple
summary. A number of built-in aggregate functions exist: COUNT, SUM, MAX, MIN,
and AVG. The
COUNT function returns the number of tuples or values as specified in a query. The
functions SUM,
MAX, MIN, and AVG can be applied to a set or multiset of numeric values and
return, respectively,
the sum, maximum value, minimum value, and average (mean) of those values. These
functions can be used in the SELECT clause or in a HAVING clause. The functions
MAX and MIN can also be used with attributes that have nonnumeric domains if the
domain values have a total ordering among one another.
Examples
1. Find the sum of the salaries of all employees, the maximum salary, the minimum
salary, and the
average salary.
SELECT SUM (Salary), MAX (Salary), MIN (Salary), AVG (Salary)
FROM EMPLOYEE;
2. Find the sum of the salaries of all the employees of the 'Research' department,
as well as the maximum salary, the minimum salary, and the average salary in this
department.
SELECT SUM (Salary), MAX (Salary), MIN (Salary), AVG (Salary)
FROM (EMPLOYEE JOIN DEPARTMENT ON Dno=Dnumber)
WHERE Dname='Research';

3. Count the number of distinct salary values in the database.


SELECT COUNT (DISTINCT Salary)
FROM EMPLOYEE;

To retrieve the names of all employees who have two or more dependents
SELECT Lname, Fname
FROM EMPLOYEE
WHERE ( SELECT COUNT (*)
FROM DEPENDENT
WHERE Ssn=Essn ) >= 2;

32. Comparisons Involving NULL and Three-Valued Logic

SQL has various rules for dealing with NULL values. NULL is used to represent a missing
value, but it usually has one of three different interpretations: the value is unknown,
the value is unavailable (withheld), or the attribute does not apply.
Examples:
1. Unknown value. A person's date of birth may not be known, so it is represented by
NULL in the database.
2. Unavailable or withheld value. A person has a home phone but does not want it to be
listed, so it is withheld and represented as NULL in the database.
3. Not applicable attribute. An attribute College_Degree would be NULL for a person who
has no college degree because it does not apply to that person.
Each individual NULL value is considered to be different from every other NULL value in
the various database records. When a NULL is involved in a comparison operation, the
result is considered to be UNKNOWN (it may be TRUE or it may be FALSE). Hence, SQL
uses a three-valued logic with values TRUE, FALSE, and UNKNOWN instead of the
standard two-valued (Boolean) logic with values TRUE or FALSE. It is therefore necessary
to define the results (or truth values) of three-valued logical expressions when the
logical connectives AND, OR, and NOT are used.
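The standard three-valued truth tables are:

AND      TRUE     FALSE    UNKNOWN
TRUE     TRUE     FALSE    UNKNOWN
FALSE    FALSE    FALSE    FALSE
UNKNOWN  UNKNOWN  FALSE    UNKNOWN

OR       TRUE     FALSE    UNKNOWN
TRUE     TRUE     TRUE     TRUE
FALSE    TRUE     FALSE    UNKNOWN
UNKNOWN  TRUE     UNKNOWN  UNKNOWN

NOT
TRUE     FALSE
FALSE    TRUE
UNKNOWN  UNKNOWN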

The rows and columns of these tables represent the results of comparison conditions,
which would typically appear in the WHERE clause of an SQL query.

In select-project-join queries, the general rule is that only those combinations of
tuples that evaluate the logical expression in the WHERE clause of the query to TRUE
are selected. Tuple combinations that evaluate to FALSE or UNKNOWN are not selected.
SQL allows queries that check whether an attribute value is NULL using the comparison
operators IS or IS NOT.

Example: Retrieve the names of all employees who do not have supervisors.

SELECT Fname, Lname

FROM EMPLOYEE
WHERE Super_ssn IS NULL;

33. Explain Transaction Processing and Single-User versus Multiuser Systems


The concept of transaction provides a mechanism for describing logical units of
database processing. Transaction processing systems are systems with large
databases and hundreds of concurrent users executing database transactions.
Examples:
• airline reservations
• banking
• credit card processing
• online retail purchasing
• stock markets, supermarket checkouts, and many other applications
These systems require high availability and fast response time for hundreds of
concurrent users. A transaction is typically implemented by a computer program,
which includes database commands such as retrievals, insertions, deletions, and
updates.

One criterion for classifying a database system is the number of users who can use
the system concurrently.
Single-User versus Multiuser Systems
A DBMS is:
• single-user: at most one user at a time can use the system (e.g., a personal computer system)
• multiuser: many users can use the system and hence access the database concurrently (e.g., an airline reservation database)
Concurrent access is possible because of multiprogramming, which can be achieved
through interleaved execution or parallel processing.
Multiprogramming operating systems execute some commands from one process, then
suspend that process and execute some commands from the next process, and so on.
A process is resumed at the point where it was suspended whenever it gets its
turn to use the CPU again. Hence, concurrent execution of processes is actually
interleaved; for example, two processes A and B may execute concurrently in an
interleaved fashion.
Interleaving keeps the CPU busy when a process requires an input or output (I/O)
operation, such as reading a block from disk
The CPU is switched to execute another process rather than remaining idle during
I/O time
Interleaving also prevents a long process from delaying other processes.
If the computer system has multiple hardware processors (CPUs), true parallel
processing of multiple processes is also possible; for example, two further
processes C and D could execute simultaneously on separate CPUs.
Most of the theory concerning concurrency control in databases is developed in
terms of interleaved concurrency
In a multiuser DBMS, the stored data items are the primary resources that may be
accessed concurrently by interactive users or application programs, which are
constantly retrieving information from and modifying the database.

34. Explain Transactions, Database Items, Read and Write Operations, and DBMS Buffers
A transaction is an executing program that forms a logical unit of database processing.
It includes one or more database access operations, such as insertion, deletion,
modification, or retrieval operations.
It can be either embedded within an application program (using begin transaction
and end transaction statements) or specified interactively via a high-level query
language such as SQL.
Transactions that do not update the database are known as read-only transactions;
transactions that do update the database are known as read-write transactions.
A database is basically represented as a collection of named data items. The size of a
data item is called its granularity. A data item can be a database record, but it can
also be a larger unit such as a whole disk block, or even a smaller unit such as an
individual field (attribute) value of some record in the database. Each data item has
a unique name.
Basic DB access operations that a transaction can include are:
read_item(X): Reads a DB item named X into a program variable.
write_item(X): Writes the value of a program variable into the DB item named X
Executing read_item(X) includes the following steps:
1. Find the address of the disk block that contains item X
2. Copy the block into a buffer in main memory
3. Copy the item X from the buffer to program variable named X.
Executing write_item(X) includes the following steps:
1. Find the address of the disk block that contains item X
2. Copy the disk block into a buffer in main memory
3. Copy item X from program variable named X into its correct location in buffer.
4. Store the updated disk block from buffer back to disk (either immediately or
later).
The decision of when to store a modified disk block back to disk is handled by the
recovery manager of the DBMS in cooperation with the operating system.
A DB cache includes a number of data buffers. When the buffers are all occupied, a
buffer replacement policy (e.g., LRU, least recently used) is used to choose one of
the buffers to be replaced.
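
Putting these operations together, a typical read-write transaction T that transfers an amount N from item X to item Y could be sketched, in the read_item/write_item notation above, as:

T: read_item(X);
   X := X - N;
   write_item(X);
   read_item(Y);
   Y := Y + N;
   write_item(Y);

Each read_item and write_item goes through the buffer steps listed above; whether the updated blocks reach disk immediately or later is decided by the recovery manager.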

35. Explain briefly the Binary Lock System

A binary lock can have two states or values: locked and unlocked (or 1 and 0).
If the value of the lock on X is 1, item X cannot be accessed by a database
operation that requests the item.

If the simple binary locking scheme described here is used, every transaction must obey
the following rules:

1. A transaction T must issue the operation lock_item(X) before any read_item(X) or
write_item(X) operations are performed in T.
2. A transaction T must issue the operation unlock_item(X) after all read_item(X) and
write_item(X) operations are completed in T.
3. A transaction T will not issue a lock_item(X) operation if it already holds the lock
on item X.
4. A transaction T will not issue an unlock_item(X) operation unless it already holds
the lock on item X.
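
A minimal sketch of the two operations, following the standard textbook description (LOCK(X) denotes the lock field associated with item X):

lock_item(X):
B: if LOCK(X) = 0                (item is unlocked)
     then LOCK(X) := 1           (lock the item)
   else begin
     wait (until LOCK(X) = 0 and the lock manager wakes up the transaction);
     go to B
   end;

unlock_item(X):
   LOCK(X) := 0;                 (unlock the item)
   if any transactions are waiting
     then wake up one of the waiting transactions;

Because a transaction must wait whenever LOCK(X) = 1, at most one transaction can hold the lock on an item at any time, which enforces mutual exclusion on X.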
