Bcs403 Dbms Module 4 Note Chptrs

Module 4 covers advanced SQL queries, including handling NULL values, nested queries, and aggregate functions. It also discusses transaction processing concepts, various types of joins, and the creation and management of views in SQL. The module emphasizes the importance of grouping and summarizing data effectively using SQL commands.

Uploaded by

taca23cs

Module 4- syllabus

SQL: Advanced Queries: More complex SQL retrieval queries, Specifying
constraints as assertions and action triggers, Views in SQL.

Transaction Processing: Introduction to Transaction Processing,
Transaction and System concepts, Desirable properties of Transactions,
Characterizing schedules based on recoverability, Characterizing
schedules based on Serializability, Transaction support in SQL.

Chapter 1: SQL Advanced Queries
More Complex SQL Retrieval Queries
Comparisons Involving NULL and Three-Valued Logic
SQL has various rules for dealing with NULL values. NULL is used to represent a
missing value, but it usually has one of three different interpretations: value
unknown, value unavailable (withheld), or value not applicable.
Example
1. Unknown value. A person’s date of birth is not known, so it is represented by
NULL in the database.
2. Unavailable or withheld value. A person has a home phone but does not want
it to be listed, so it is withheld and represented as NULL in the database.
3. Not applicable attribute. An attribute CollegeDegree would be NULL for a person
who has no college degrees because it does not apply to that person.
Each individual NULL value is considered to be different from every other NULL value in
the various database records. When a NULL is involved in a comparison operation, the
result is considered to be UNKNOWN (it may be TRUE or it may be FALSE). Hence,
SQL uses a three-valued logic with values TRUE, FALSE, and UNKNOWN instead
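of the standard two-valued (Boolean) logic with values TRUE or FALSE. It is
therefore necessary to define the results (or truth values) of three-valued logical
expressions when the logical connectives AND, OR, and NOT are used. Because a
comparison with NULL is never TRUE, SQL provides the IS NULL and IS NOT NULL
operators to test for NULL values.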
Example: Retrieve the names of all employees who do not have supervisors.
SELECT Fname, Lname
FROM EMPLOYEE
WHERE Super_ssn IS NULL;
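The difference between comparing with = and testing with IS NULL can be checked with a small script. This is a minimal sketch using Python's built-in sqlite3 module; the two sample rows and the reduced EMPLOYEE schema are assumptions made for illustration, and SQLite reports UNKNOWN as NULL (None in Python).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT, Super_ssn TEXT);
-- Illustrative rows (assumed data): Borg has no supervisor.
INSERT INTO EMPLOYEE VALUES ('James', 'Borg', '888665555', NULL);
INSERT INTO EMPLOYEE VALUES ('Franklin', 'Wong', '333445555', '888665555');
""")

# Super_ssn = NULL evaluates to UNKNOWN for every row, so nothing is selected.
with_equals = conn.execute(
    "SELECT Fname, Lname FROM EMPLOYEE WHERE Super_ssn = NULL").fetchall()

# IS NULL is the correct test for a missing value.
with_is_null = conn.execute(
    "SELECT Fname, Lname FROM EMPLOYEE WHERE Super_ssn IS NULL").fetchall()

# Three-valued connectives: UNKNOWN AND FALSE, UNKNOWN OR TRUE, NOT UNKNOWN.
unknown_and_false, unknown_or_true, not_unknown = conn.execute(
    "SELECT (NULL = NULL) AND 0, (NULL = NULL) OR 1, NOT (NULL = NULL)"
).fetchone()

print(with_equals, with_is_null)
print(unknown_and_false, unknown_or_true, not_unknown)
```

UNKNOWN AND FALSE is FALSE and UNKNOWN OR TRUE is TRUE, while NOT UNKNOWN stays UNKNOWN.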
Nested Queries, Tuples, and Set/Multiset Comparisons

Some queries require that existing values in the database be fetched and then used in
a comparison condition. Such queries can be conveniently formulated by using
nested queries, which are complete select-from-where blocks within the WHERE
clause of another query. That other query is called the outer query.

Example 1: List the project numbers of projects that have an employee with last
name 'Smith' as manager.
We make use of the comparison operator IN, which compares a value v with a set
(or multiset) of values V and evaluates to TRUE if v is one of the elements in V.
SELECT DISTINCT Pnumber
FROM PROJECT
WHERE Pnumber IN
(SELECT Pnumber
FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE Dnum=Dnumber AND Mgr_ssn=Ssn AND Lname='Smith');
Nested Queries: Comparison Operators

Other comparison operators can be used to compare a single value v to a set or multiset
V. The = ANY (or = SOME) operator returns TRUE if the value v is equal to some value in
the set V and is hence equivalent to IN. The two keywords ANY and SOME have the
same effect. Other operators that can be combined with ANY (or SOME) include >, >=, <,
<=, and <>. The keyword ALL can also be combined with each of these operators.

For example, the comparison condition (v > ALL V) returns TRUE if the value v is greater
than all the values in the set (or multiset) V. The following query returns the names of
employees whose salary is greater than the salary of all the employees in department 5:

SELECT Lname, Fname
FROM EMPLOYEE
WHERE Salary > ALL ( SELECT Salary
FROM EMPLOYEE
WHERE Dno=5 );
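The ANY and ALL comparisons can be read as "for some element of V" and "for every element of V". The following is a minimal Python sketch of that reading, not of how a DBMS evaluates the query; the function names and the salary figures are assumptions made for illustration, and NULL values are ignored.

```python
def eq_any(v, values):
    """v = ANY V (same as v IN V): TRUE if v equals at least one element of V."""
    return any(v == x for x in values)


def gt_all(v, values):
    """v > ALL V: TRUE if v is greater than every element of V."""
    return all(v > x for x in values)


# Hypothetical salaries of the employees in department 5.
dept5_salaries = [30000, 40000, 38000, 25000]

print(gt_all(55000, dept5_salaries))  # greater than every salary in the list
print(gt_all(40000, dept5_salaries))  # not greater than 40000 itself
print(eq_any(38000, dept5_salaries))  # equal to one of the salaries
```

Note that v > ALL V is TRUE when V is empty, since there is no element that v fails to exceed.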
The EXISTS and UNIQUE Functions in SQL
EXISTS Functions
The EXISTS function in SQL is used to check whether the result of a correlated
nested query is empty (contains no tuples) or not. The result of EXISTS is a Boolean value:
• TRUE if the nested query result contains at least one tuple, or
• FALSE if the nested query result contains no tuples.

Example: List the names of managers who have at least one dependent.
SELECT Fname, Lname
FROM EMPLOYEE
WHERE EXISTS ( SELECT *
FROM DEPENDENT
WHERE Ssn=Essn )
AND
EXISTS ( SELECT *
FROM DEPARTMENT
WHERE Ssn=Mgr_ssn );

In general, EXISTS(Q) returns TRUE if there is at least one tuple in the result of the nested query Q,
and it returns FALSE otherwise.
NOT EXISTS Functions
NOT EXISTS(Q) returns TRUE if there are no tuples in the result of nested query Q,
and it returns FALSE otherwise.

Example: Retrieve the names of employees who have no dependents.
SELECT Fname, Lname
FROM EMPLOYEE
WHERE NOT EXISTS
( SELECT *
FROM DEPENDENT
WHERE Ssn=Essn );

For each EMPLOYEE tuple, the correlated nested query selects all DEPENDENT
tuples whose Essn value matches the EMPLOYEE Ssn; if the result is empty, no
dependents are related to the employee, so we select that EMPLOYEE tuple and
retrieve its Fname and Lname.
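The two queries above can be tried on a small data set. This is a minimal sketch using Python's built-in sqlite3 module; the reduced table definitions and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT PRIMARY KEY);
CREATE TABLE DEPENDENT (Essn TEXT, Dependent_name TEXT);
-- Illustrative rows (assumed data): only Smith has a dependent.
INSERT INTO EMPLOYEE VALUES ('John', 'Smith', '123456789');
INSERT INTO EMPLOYEE VALUES ('Alicia', 'Zelaya', '999887777');
INSERT INTO DEPENDENT VALUES ('123456789', 'Alice');
""")

# The nested query is re-evaluated for each EMPLOYEE tuple (correlated on Ssn).
with_dependents = conn.execute("""
    SELECT Fname, Lname FROM EMPLOYEE
    WHERE EXISTS (SELECT * FROM DEPENDENT WHERE Ssn = Essn)
""").fetchall()

without_dependents = conn.execute("""
    SELECT Fname, Lname FROM EMPLOYEE
    WHERE NOT EXISTS (SELECT * FROM DEPENDENT WHERE Ssn = Essn)
""").fetchall()

print(with_dependents)
print(without_dependents)
```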
UNIQUE Functions
UNIQUE(Q) returns TRUE if there are no duplicate tuples in the result
of query Q; otherwise, it returns FALSE.
The UNIQUE function is distinct from the UNIQUE constraint used in table definitions:
• The UNIQUE constraint ensures that all values in a column are
different.
• Both the UNIQUE and PRIMARY KEY constraints provide a
guarantee for uniqueness for a column or set of columns.
• A PRIMARY KEY constraint automatically has a UNIQUE constraint.
SQL UNIQUE Constraint on CREATE TABLE
CREATE TABLE Persons (
ID int NOT NULL UNIQUE,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int
);
SQL UNIQUE Constraint on ALTER TABLE
To create a UNIQUE constraint on the "ID" column when the table is already
created, use the following SQL:
ALTER TABLE Persons
ADD UNIQUE (ID);
Explicit Sets and Renaming of Attributes in SQL

In SQL it is possible to use an explicit set of values in the WHERE clause, rather
than a nested query. Such a set is enclosed in parentheses.
Example: Retrieve the Social Security numbers of all employees who work on
project numbers 1, 2, or 3.
SELECT DISTINCT Essn
FROM WORKS_ON
WHERE Pno IN (1, 2, 3);

In SQL, it is possible to rename any attribute that appears in the result of
a query by adding the qualifier AS followed by the desired new name.
Example: Retrieve the last name of each employee and his or her supervisor.
SELECT E.Lname AS Employee_name, S.Lname AS Supervisor_name
FROM EMPLOYEE AS E, EMPLOYEE AS S
WHERE E.Super_ssn=S.Ssn;
Join in SQL
A JOIN is a means for combining fields from two tables by using values common
to each. These notes cover four types of JOIN:
1. INNER JOIN
2. OUTER JOIN
3. EQUIJOIN
4. NATURAL JOIN
INNER JOIN
An inner join is the most common join operation used in applications and can be
regarded as the default join-type. Inner join creates a new result table by
combining column values of two tables (A and B) based upon the join-predicate
(the condition).
The result of the join can be defined as the outcome of first taking the
Cartesian product (or cross join) of all records in the tables (combining every
record in table A with every record in table B), then returning all records
which satisfy the join predicate.
Example:
SELECT * FROM employee
INNER JOIN department ON
employee.dno = department.dnumber;
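The definition of an inner join as a filtered Cartesian product can be checked directly. This is a minimal sketch using Python's built-in sqlite3 module; the reduced employee and department tables and their rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (lname TEXT, dno INTEGER);
CREATE TABLE department (dname TEXT, dnumber INTEGER);
-- Illustrative rows (assumed data); 'Doe' belongs to no department.
INSERT INTO employee VALUES ('Smith', 5), ('Wallace', 4), ('Doe', NULL);
INSERT INTO department VALUES
    ('Research', 5), ('Administration', 4), ('Headquarters', 1);
""")

inner = conn.execute("""
    SELECT lname, dname FROM employee
    INNER JOIN department ON employee.dno = department.dnumber
    ORDER BY lname
""").fetchall()

# Same result: take the Cartesian product, then keep rows satisfying the predicate.
cross_then_filter = conn.execute("""
    SELECT lname, dname FROM employee CROSS JOIN department
    WHERE employee.dno = department.dnumber
    ORDER BY lname
""").fetchall()

# The unfiltered product pairs every employee with every department.
cross_size = conn.execute(
    "SELECT COUNT(*) FROM employee CROSS JOIN department").fetchone()[0]

print(inner, cross_size)
```

The employee without a department and the department without employees do not appear in the inner join; keeping such rows is what outer joins are for.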
EQUIJOIN and NATURAL JOIN
An EQUIJOIN is a specific type of comparator-based join that uses only equality
comparisons in the join-predicate. Using other comparison operators (such as <)
disqualifies a join as an equijoin.

NATURAL JOIN is a type of EQUIJOIN where the join predicate arises implicitly by
comparing all columns in both tables that have the same column-names in the joined
tables. The resulting joined table contains only one column for each pair of equally named
columns.
OUTER JOIN
An outer join does not require each record in the two joined tables to have a matching
record. The joined table retains each record, even if no other matching record exists.
Outer joins subdivide further into
• Left outer joins
• Right outer joins
• Full outer joins

SQL LEFT JOIN
The SQL LEFT JOIN joins two tables based on a common column, and selects
records that have matching values in these columns and remaining rows from
the left table.
Example
SELECT Customers.customer_id, Customers.first_name, Orders.amount
FROM Customers
LEFT JOIN Orders
ON Customers.customer_id = Orders.customer;

SQL RIGHT JOIN
The SQL RIGHT JOIN joins two tables based on a common column, and
selects records that have matching values in these columns and remaining
rows from the right table.
Example
SELECT Customers.customer_id, Customers.first_name, Orders.amount
FROM Customers
RIGHT JOIN Orders
ON Customers.customer_id = Orders.customer;
LEFT JOIN With WHERE Clause
The SQL command can have an optional WHERE clause with the LEFT
JOIN statement. For example,
SELECT Customers.customer_id, Customers.first_name, Orders.amount
FROM Customers
LEFT JOIN Orders
ON Customers.customer_id = Orders.customer
WHERE Orders.amount >= 500;

RIGHT JOIN With WHERE Clause
Example
SELECT Customers.customer_id, Customers.first_name, Orders.amount
FROM Customers
RIGHT JOIN Orders
ON Customers.customer_id = Orders.customer
WHERE Orders.amount >= 500;
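The WHERE clause is applied after the join has been formed, so a condition on a column of the right table also removes the unmatched left rows, whose Orders columns are NULL. The following is a minimal sketch using Python's built-in sqlite3 module; the column lists and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (customer_id INTEGER, first_name TEXT);
CREATE TABLE Orders (order_id INTEGER, amount INTEGER, customer INTEGER);
-- Illustrative rows (assumed data); customer 3 has placed no order.
INSERT INTO Customers VALUES (1, 'John'), (2, 'Robert'), (3, 'David');
INSERT INTO Orders VALUES (10, 400, 1), (11, 600, 2);
""")

left_join = conn.execute("""
    SELECT Customers.customer_id, Customers.first_name, Orders.amount
    FROM Customers
    LEFT JOIN Orders
    ON Customers.customer_id = Orders.customer
    ORDER BY Customers.customer_id
""").fetchall()

left_join_where = conn.execute("""
    SELECT Customers.customer_id, Customers.first_name, Orders.amount
    FROM Customers
    LEFT JOIN Orders
    ON Customers.customer_id = Orders.customer
    WHERE Orders.amount >= 500
    ORDER BY Customers.customer_id
""").fetchall()

print(left_join)        # customer 3 is kept with a NULL amount
print(left_join_where)  # NULL >= 500 is UNKNOWN, so customer 3 is dropped
```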
SQL FULL OUTER JOIN
It joins two tables based on a common column, and selects records that
have matching values in these columns and remaining rows from both of
the tables.

Syntax of FULL OUTER JOIN
SELECT columns
FROM table1
FULL OUTER JOIN table2
ON table1.column_name = table2.column_name;

FULL OUTER JOIN With WHERE Clause
The SQL command can have an optional WHERE clause with the FULL OUTER JOIN
statement. For example,
SELECT Customers.customer_id, Customers.first_name, Orders.amount
FROM Customers
FULL OUTER JOIN Orders
ON Customers.customer_id = Orders.customer
WHERE Orders.amount >= 500;
Aggregate Functions in SQL

Aggregate functions are used to summarize information from multiple
tuples into a single-tuple summary. A number of built-in aggregate
functions exist: COUNT, SUM, MAX, MIN, and AVG. The COUNT
function returns the number of tuples or values as specified in a
query. The functions SUM, MAX, MIN, and AVG can be applied to a
set or multiset of numeric values and return, respectively, the sum,
maximum value, minimum value, and average (mean) of those
values. These functions can be used in the SELECT clause or in a
HAVING clause. The functions MAX and MIN can also be used with
attributes that have nonnumeric domains if the domain values have a
total ordering among one another.
Examples

1. Find the sum of the salaries of all employees, the maximum salary, the
minimum salary, and the average salary.
SELECT SUM (Salary), MAX (Salary), MIN (Salary), AVG (Salary)
FROM EMPLOYEE;

2. Find the sum of the salaries of all employees of the ‘Research’ department, as
well as the maximum salary, the minimum salary, and the average salary in this
department.
SELECT SUM (Salary), MAX (Salary), MIN (Salary), AVG (Salary)
FROM (EMPLOYEE JOIN DEPARTMENT ON Dno=Dnumber)
WHERE Dname='Research';

3. Count the number of distinct salary values in the database.
SELECT COUNT (DISTINCT Salary)
FROM EMPLOYEE;
Grouping: The GROUP BY and HAVING Clauses

Grouping is used to create subgroups of tuples before summarization.
For example, we may want to find the average salary of employees in each
department or the number of employees who work on each project. In these cases we
need to partition the relation into nonoverlapping subsets (or groups) of tuples.

Each group (partition) will consist of the tuples that have the same value of some
attribute(s), called the grouping attribute(s).
SQL has a GROUP BY clause for this purpose. The GROUP BY clause specifies the
grouping attributes, which should also appear in the SELECT clause, so that the value
resulting from applying each aggregate function to a group of tuples appears along
with the value of the grouping attribute(s).
Example: For each department, retrieve the department number, the number of
employees in the department, and their average salary.
SELECT Dno, COUNT (*), AVG (Salary)
FROM EMPLOYEE
GROUP BY Dno;
HAVING provides a condition on the summary information
regarding the group of tuples associated with each value of the
grouping attributes. Only the groups that satisfy the condition are
retrieved in the result of the query.

Example: For each project on which more than two employees work,
retrieve the project number, the project name, and the number of
employees who work on the project.

SELECT Pnumber, Pname, COUNT (*)
FROM PROJECT, WORKS_ON
WHERE Pnumber=Pno
GROUP BY Pnumber, Pname
HAVING COUNT (*) > 2;
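Grouping and the HAVING filter can be seen side by side on a small table. This is a minimal sketch using Python's built-in sqlite3 module; the reduced EMPLOYEE table and the salary figures are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Lname TEXT, Salary INTEGER, Dno INTEGER);
-- Illustrative rows (assumed data).
INSERT INTO EMPLOYEE VALUES
    ('Smith', 30000, 5), ('Wong', 40000, 5), ('Narayan', 38000, 5),
    ('Wallace', 43000, 4), ('Zelaya', 25000, 4), ('Borg', 55000, 1);
""")

# One result row per group: department number, head count, average salary.
per_department = conn.execute("""
    SELECT Dno, COUNT(*), AVG(Salary)
    FROM EMPLOYEE
    GROUP BY Dno
    ORDER BY Dno
""").fetchall()

# HAVING keeps only the groups that satisfy the condition on the group.
large_departments = conn.execute("""
    SELECT Dno, COUNT(*)
    FROM EMPLOYEE
    GROUP BY Dno
    HAVING COUNT(*) > 2
""").fetchall()

print(per_department)
print(large_departments)
```

WHERE filters individual tuples before the groups are formed, whereas HAVING filters whole groups after aggregation.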
Summary of SQL Queries

• The SELECT clause lists the attributes or functions to be retrieved.
• The FROM clause specifies all relations (tables) needed in the query, including
joined relations, but not those in nested queries.
• The WHERE clause specifies the conditions for selecting the tuples from these
relations, including join conditions if needed.
• GROUP BY specifies grouping attributes, whereas HAVING specifies a condition on
the groups being selected rather than on the individual tuples.
• Finally, ORDER BY specifies an order for displaying the result of a query.
Views (Virtual Tables) in SQL

A view in SQL terminology is a single table that is derived from
other tables. These other tables can be base tables or previously
defined views. A view does not necessarily exist in physical form;
it is considered to be a virtual table, in contrast to base tables,
whose tuples are always physically stored in the database. This
limits the possible update operations that can be applied to views,
but it does not provide any limitations on querying a view.

For example, referring to the COMPANY database, we may frequently issue queries that
retrieve the employee name and the project names that the employee works on. Rather
than having to specify the join of the three tables EMPLOYEE, WORKS_ON, and
PROJECT every time we issue this query, we can define a view that is specified as the
result of these joins. Then we can issue queries on the view, which are specified as single
table retrievals rather than as retrievals involving two joins on three tables. We call the
EMPLOYEE, WORKS_ON, and PROJECT tables the defining tables of the view.
Specification of Views in SQL

In SQL, the command to specify a view is CREATE VIEW. The view is given a (virtual)
table name (or view name), a list of attribute names, and a query to specify the contents
of the view.
Example 1:
CREATE VIEW WORKS_ON1
AS SELECT Fname, Lname, Pname, Hours
FROM EMPLOYEE, PROJECT, WORKS_ON
WHERE Ssn=Essn AND Pno=Pnumber;
We can now specify SQL queries on a view (or virtual table) in the same way we specify
queries involving base tables. For example, to retrieve the last name and first name of all
employees who work on the 'ProductX' project, we can utilize the WORKS_ON1 view and
specify the query as:
SELECT Fname, Lname
FROM WORKS_ON1
WHERE Pname='ProductX';
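The WORKS_ON1 view can be created and queried on a small data set to confirm that it returns the same rows as the equivalent join on its defining tables. This is a minimal sketch using Python's built-in sqlite3 module; the reduced table definitions and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT);
CREATE TABLE PROJECT (Pname TEXT, Pnumber INTEGER);
CREATE TABLE WORKS_ON (Essn TEXT, Pno INTEGER, Hours REAL);
-- Illustrative rows (assumed data).
INSERT INTO EMPLOYEE VALUES
    ('John', 'Smith', '123456789'), ('Joyce', 'English', '453453453');
INSERT INTO PROJECT VALUES ('ProductX', 1), ('ProductY', 2);
INSERT INTO WORKS_ON VALUES
    ('123456789', 1, 32.5), ('123456789', 2, 7.5), ('453453453', 1, 20.0);

CREATE VIEW WORKS_ON1 AS
SELECT Fname, Lname, Pname, Hours
FROM EMPLOYEE, PROJECT, WORKS_ON
WHERE Ssn = Essn AND Pno = Pnumber;
""")

# Single-table retrieval on the view.
via_view = conn.execute("""
    SELECT Fname, Lname FROM WORKS_ON1
    WHERE Pname = 'ProductX'
    ORDER BY Lname
""").fetchall()

# The same retrieval written against the defining tables.
via_base_tables = conn.execute("""
    SELECT Fname, Lname FROM EMPLOYEE, PROJECT, WORKS_ON
    WHERE Ssn = Essn AND Pno = Pnumber AND Pname = 'ProductX'
    ORDER BY Lname
""").fetchall()

print(via_view)
```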
View Implementation, View Update and Inline Views
The problem of efficiently implementing a view for querying is complex. Two main
approaches have been suggested.
• One strategy, called query modification, involves modifying or transforming the
view query (submitted by the user) into a query on the underlying base tables. For
example, the query
SELECT Fname, Lname
FROM WORKS_ON1
WHERE Pname='ProductX';

would be automatically modified to the following query by the DBMS:

SELECT Fname, Lname
FROM EMPLOYEE, PROJECT, WORKS_ON
WHERE Ssn=Essn AND Pno=Pnumber
AND Pname='ProductX';

The disadvantage of this approach is that it is inefficient for views defined via complex
queries that are time-consuming to execute, especially if multiple queries are going to
be applied to the same view within a short period of time.
• The second strategy, called view materialization, involves physically creating a
temporary view table when the view is first queried and keeping that table on the
assumption that other queries on the view will follow. In this case, an efficient strategy for
automatically updating the view table when the base tables are updated must be
developed in order to keep the view up-to-date.

Techniques using the concept of incremental update have been developed for this
purpose, where the DBMS can determine what new tuples must be inserted, deleted, or
modified in a materialized view table when a database update is applied to one of the
defining base tables.

Updating of views
Updating a view is complicated and can be ambiguous. In general, an update on a view
defined on a single table without any aggregate functions can be mapped to an update on the
underlying base table under certain conditions. For a view involving joins, an update operation may
be mapped to update operations on the underlying base relations in multiple ways. Hence, it is often
not possible for the DBMS to determine which of the updates is intended.
To illustrate potential problems with updating a view defined on multiple tables,
consider the WORKS_ON1 view, and suppose that we issue the command to update
the Pname attribute of 'John Smith' from 'ProductX' to 'ProductY'.

This view update is shown in UV1:

UV1:

UPDATE WORKS_ON1
SET Pname = 'ProductY'
WHERE Lname='Smith' AND Fname='John'
AND Pname='ProductX';

This update can be mapped into several different updates on the base
relations, each giving the desired update effect on the view.
Inline View

An inline view is a SELECT statement in the FROM clause. As mentioned in the
View section, a view is a virtual table that has the characteristics of a
table yet does not hold any actual data. In an inline view construct,
instead of specifying table name(s) after the FROM keyword, the
source of the data actually comes from the inline view.

An inline view is sometimes referred to as a derived table. These two
terms are used interchangeably.

Syntax
The syntax for an inline view is,

SELECT "column_name" FROM (Inline View);


Example
Assume we have two tables: The first table is User_Address, which maps each user to a
ZIP code; the second table is User_Score, which records all the scores of each user. The
question is, how to write a SQL query to find the number of users who scored higher than
200 for each ZIP code?

Without using an inline view, we can accomplish this in two steps:

Query 1
CREATE TABLE User_Higher_Than_200 AS
SELECT User_ID, SUM(Score) FROM User_Score
GROUP BY User_ID
HAVING SUM(Score) > 200;

Query 2
SELECT a2.ZIP_CODE, COUNT(a1.User_ID)
FROM User_Higher_Than_200 a1, User_Address a2
WHERE a1.User_ID = a2.User_ID
GROUP BY a2.ZIP_CODE;

In the code, we introduced a temporary table, User_Higher_Than_200, to
store the list of users who scored higher than 200. User_Higher_Than_200 is
then used to join to the User_Address table to get the final result.
We can simplify the above SQL using the inline view construct as follows:

Query 3

SELECT a2.ZIP_CODE, COUNT(a1.User_ID)
FROM
(SELECT User_ID, SUM(Score) FROM User_Score GROUP BY User_ID HAVING
SUM(Score) > 200) a1,
User_Address a2
WHERE a1.User_ID = a2.User_ID
GROUP BY a2.ZIP_CODE;

The parenthesized SELECT in the FROM clause (aliased as a1) is the inline view.
There are two advantages of using an inline view here:

1. We do not need to create the temporary table. This prevents the database from
having too many objects, which is a good thing as each additional object in the
database costs resources to manage.

2. We can use a single SQL query to accomplish what we want.
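Query 3 can be run as a single statement, with no intermediate table. This is a minimal sketch using Python's built-in sqlite3 module; the column types, the ZIP codes, and the scores are assumptions made for illustration, and the aggregate in the inline view is given the alias Total only for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User_Address (User_ID INTEGER, ZIP_CODE TEXT);
CREATE TABLE User_Score (User_ID INTEGER, Score INTEGER);
-- Illustrative rows (assumed data).
INSERT INTO User_Address VALUES
    (1, '560001'), (2, '560001'), (3, '560002'), (4, '560002');
INSERT INTO User_Score VALUES
    (1, 150), (1, 100), (2, 120), (3, 210), (4, 300);
""")

rows = conn.execute("""
    SELECT a2.ZIP_CODE, COUNT(a1.User_ID)
    FROM
        (SELECT User_ID, SUM(Score) AS Total
         FROM User_Score
         GROUP BY User_ID
         HAVING SUM(Score) > 200) a1,   -- the inline view (derived table)
        User_Address a2
    WHERE a1.User_ID = a2.User_ID
    GROUP BY a2.ZIP_CODE
    ORDER BY a2.ZIP_CODE
""").fetchall()

print(rows)
```

Users 1, 3, and 4 have total scores above 200, so one user is counted for the first ZIP code and two for the second.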


Specifying Constraints as Assertions and Actions as Triggers

• Specifying General Constraints as Assertions in SQL

Assertions are used to specify additional types of constraints outside
the scope of the built-in relational model constraints. In SQL, users can specify
general constraints via declarative assertions, using the CREATE
ASSERTION statement of the DDL. Each assertion is given a constraint
name and is specified via a condition similar to the WHERE clause of an
SQL query.

General form:
CREATE ASSERTION <Name_of_assertion> CHECK (<cond>)

For the assertion to be satisfied, the condition specified after the CHECK
clause must return true.
For example, to specify the constraint that the salary of an employee must not be
greater than the salary of the manager of the department that the employee works for in
SQL, we can write the following assertion:

CREATE ASSERTION SALARY_CONSTRAINT
CHECK ( NOT EXISTS
( SELECT * FROM EMPLOYEE E, EMPLOYEE M, DEPARTMENT D
WHERE E.Salary>M.Salary
AND E.Dno=D.Dnumber AND D.Mgr_ssn=M.Ssn )
);
The constraint name SALARY_CONSTRAINT is followed by the keyword CHECK,
which is followed by a condition in parentheses that must hold true on every database
state for the assertion to be satisfied. The constraint name can be used later to refer to
the constraint or to modify or drop it. Any WHERE clause condition can be used, but
many constraints can be specified using the EXISTS and NOT EXISTS style of SQL
conditions.
The nested query selects all employees who earn more than the manager of their
department. By including this query inside a NOT EXISTS clause, the assertion specifies
that the result of this query must be empty so that the condition will always be TRUE.
Thus, the assertion is violated if the result of the query is not empty.
Example: consider the bank database; the assertions below use its tables borrower,
loan, account, and depositor.

1. Write an assertion to specify the constraint that the sum of loans taken by a customer
does not exceed 100,000.

CREATE ASSERTION sumofloans
CHECK (100000 >= ALL
( SELECT SUM(amount)
FROM borrower b, loan l
WHERE b.loan_number=l.loan_number
GROUP BY customer_name ));
2. Write an assertion to specify the constraint that the number of
accounts for each customer in a given branch is at most two.
CREATE ASSERTION NumAccounts
CHECK ( 2 >= ALL
( SELECT COUNT(*)
FROM account A, depositor D
WHERE A.account_number = D.account_number
GROUP BY customer_name, branch_name ));

• Introduction to Triggers in SQL

A trigger is a procedure that runs automatically when a certain event
occurs in the DBMS. In many cases it is convenient to specify the type of
action to be taken when certain events occur and when certain conditions
are satisfied. The CREATE TRIGGER statement is used to implement
such actions in SQL.

General form:
CREATE TRIGGER <name>
BEFORE | AFTER <events>
FOR EACH ROW | FOR EACH STATEMENT
WHEN (<condition>)
<action>
A trigger has three components:

1. Event: When this event happens, the trigger is activated.
Three event types: Insert, Update, Delete
Two triggering times: Before the event, After the event
2. Condition (optional): If the condition is true, the trigger executes; otherwise it is skipped.
3. Action: The actions performed by the trigger.
When the Event occurs and the Condition is true, the Action is executed.
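The event, condition, and action parts can be seen in a small working trigger. This is a minimal sketch using Python's built-in sqlite3 module and SQLite's trigger dialect, which supports only FOR EACH ROW and wraps the action in BEGIN ... END; the trigger name LOG_RAISE, the SALARY_AUDIT table, and the sample data are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Salary INTEGER);
CREATE TABLE SALARY_AUDIT (Ssn TEXT, Old_salary INTEGER, New_salary INTEGER);

CREATE TRIGGER LOG_RAISE
AFTER UPDATE OF Salary ON EMPLOYEE      -- event: update, after it happens
FOR EACH ROW
WHEN NEW.Salary > OLD.Salary            -- condition: only salary increases
BEGIN                                   -- action: record the change
    INSERT INTO SALARY_AUDIT VALUES (OLD.Ssn, OLD.Salary, NEW.Salary);
END;

INSERT INTO EMPLOYEE VALUES ('123456789', 30000);
""")

# Condition true: the trigger fires and writes one audit row.
conn.execute("UPDATE EMPLOYEE SET Salary = 35000 WHERE Ssn = '123456789'")
# Condition false (salary decreases): the trigger is skipped.
conn.execute("UPDATE EMPLOYEE SET Salary = 32000 WHERE Ssn = '123456789'")

audit_rows = conn.execute("SELECT * FROM SALARY_AUDIT").fetchall()
print(audit_rows)
```

Unlike an assertion, this trigger modifies data (it inserts into SALARY_AUDIT) and is tied to one table and one event.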
Assertions vs. Triggers
• Assertions do not modify the data; they only check certain conditions. Triggers are more
powerful because they can check conditions and also modify the data.
• Assertions are not linked to specific tables in the database and not linked to specific events.
Triggers are linked to specific tables and specific events.
• All assertions can be implemented as triggers (one or more). Not all triggers can be
implemented as assertions.
Chapter 2: Transaction Processing

Transaction:
A transaction is a program including a collection of database operations, executed as a
logical unit of data processing. The operations performed in a transaction include one or
more database operations such as insert, delete, update or retrieve data. It is an atomic
process that is either performed to completion or not performed at all. Each
high-level operation can be divided into a number of low-level tasks or operations. For
example, a data update operation can be divided into three tasks:
• read_item(): reads a data item from storage to main memory.
• modify_item(): changes the value of the item in main memory.
• write_item(): writes the modified value from main memory to storage.
The figure shows two processes, A and B, executing concurrently in an interleaved fashion.
Interleaving keeps the CPU busy when a process requires an input or output (I/O)
operation, such as reading a block from disk. The CPU is switched to execute another
process rather than remaining idle during I/O time.
Basic DB access operations that a transaction can include are:

• read_item(X): Reads a DB item named X into a program variable.
• write_item(X): Writes the value of a program variable into the DB item named X.
Question:
Why are concurrency control and recovery needed in a DBMS?
(or)
List and explain the types of problems that may occur when two transactions
run concurrently.
(or)
What are the anomalies that can occur due to interleaved execution?
(or)
What are the types of problems that may occur when two transactions run
concurrently?
Why Concurrency Control Is Needed

Several problems can occur when concurrent transactions execute in an
uncontrolled manner.
Example:
• We consider an airline reservation DB.
• A record is stored for each airline flight, which includes the number of
reserved seats among other information.
• Transaction T1 transfers N reservations from one flight whose number of
reserved seats is stored in the database item named X to another flight whose
number of reserved seats is stored in the database item named Y.
• Transaction T2 reserves M seats on the first flight (X).
Types of problems we may encounter:
1. The Lost Update Problem [WW conflict]
2. The Temporary Update / Dirty Read Problem [WR conflict]
3. The Incorrect Summary Problem
4. The Unrepeatable Read Problem [RW conflict]
1.The Lost Update Problem [WW conflict]
Occurs when two transactions that access the same DB items have their operations
interleaved in a way that makes the value of some DB item incorrect.
Suppose that transactions T1 and T2 are submitted at approximately the same time,
and suppose that their operations are interleaved as shown in the figure.

The final value of item X is incorrect because T2 reads the value of X before T1
changes it in the database, and hence the updated value resulting from T1 is lost.

For example:
X = 80 at the start (there were 80 reservations on the flight)
N = 5 (T1 transfers 5 seat reservations from the flight corresponding to X to the flight corresponding to Y)
M = 4 (T2 reserves 4 seats on X)
The final result should be X = 79. However, the interleaving of operations shown in the
figure gives X = 84, because the update in T1 that removed the five seats from X was lost.
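The arithmetic of the example can be replayed step by step. This is a minimal Python sketch, not a model of a real DBMS: the interleaved order below is an assumed schedule in which T2 reads X before T1 writes it and writes X after T1 does, the starting value Y = 50 is hypothetical, and the operation names are made up for the illustration.

```python
def run(schedule, db, n, m):
    """Execute (transaction, operation, item) steps on a copy of the database."""
    db = dict(db)                      # database items on "disk"
    local = {"T1": {}, "T2": {}}       # private program variables per transaction
    for txn, op, item in schedule:
        var = local[txn]
        if op == "read":               # read_item: database -> program variable
            var[item] = db[item]
        elif op == "write":            # write_item: program variable -> database
            db[item] = var[item]
        elif op == "sub_n":
            var[item] -= n
        elif op == "add_n":
            var[item] += n
        elif op == "add_m":
            var[item] += m
    return db


# T1 moves N reservations from X to Y; T2 reserves M seats on X.
T1 = [("T1", "read", "X"), ("T1", "sub_n", "X"), ("T1", "write", "X"),
      ("T1", "read", "Y"), ("T1", "add_n", "Y"), ("T1", "write", "Y")]
T2 = [("T2", "read", "X"), ("T2", "add_m", "X"), ("T2", "write", "X")]

serial = T1 + T2
# Assumed interleaving: T2 reads X before T1 writes it, then overwrites T1's write.
interleaved = [T1[0], T1[1], T2[0], T2[1], T1[2], T1[3], T2[2], T1[4], T1[5]]

start = {"X": 80, "Y": 50}
serial_result = run(serial, start, n=5, m=4)
interleaved_result = run(interleaved, start, n=5, m=4)

print(serial_result["X"], interleaved_result["X"])
```

In the serial run T2 sees T1's write and X ends at 79; in the interleaved run T2's write of 80 + 4 overwrites T1's 75, so X ends at 84.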
2.The Temporary Update / Dirty Read Problem [WR conflict]
Occurs when one transaction updates a database item and then the transaction fails for some
reason before committing. Meanwhile, the updated item is accessed (read) by another
transaction before it is changed back to its original value.
3.The Incorrect Summary Problem
If one transaction is calculating an aggregate summary function on a number of DB items while
other transactions are updating some of these items, the aggregate function may calculate some
values before they are updated and others after they are updated.
4.The Unrepeatable Read Problem [RW conflict]
Transaction T reads the same item twice and gets different values on
each read, since the item was modified by another transaction T' between the
two reads.
For example, during an airline reservation transaction, a customer inquires about
seat availability on several flights. When the customer decides on a particular flight,
the transaction reads the number of seats on that flight a second time before completing
the reservation, and it may end up reading a different value for the item.
Why Recovery Is Needed
Whenever a transaction is submitted to a DBMS for execution, the system is
responsible for making sure that either:
• All the operations in the transaction are completed successfully and their effect is recorded
permanently in the database, or
• The transaction does not have any effect on the database or any other transactions.
In the first case, the transaction is said to be committed, whereas in the second case, the
transaction is aborted.
If a transaction fails after executing some of its operations but before executing all of them,
the operations already executed must be undone and have no lasting effect.
Question:

With a neat diagram, explain the state transition diagram of a transaction.
Transaction States and Operations

A transaction is an atomic unit of work that should either be completed in its entirety
or not done at all. For recovery purposes, the system keeps track of when each
transaction starts, terminates, and commits or aborts.

1. BEGIN_TRANSACTION: marks the beginning of transaction execution.
2. READ or WRITE: specify read or write operations on the database items that
are executed as part of a transaction.
3. END_TRANSACTION: specifies that READ and WRITE transaction operations
have ended and marks the end of transaction execution.
4. COMMIT_TRANSACTION: signals a successful end of the transaction so that
any changes (updates) executed by the transaction can be safely committed to
the database and will not be undone.
5. ROLLBACK: signals that the transaction has ended unsuccessfully, so that any
changes or effects that the transaction may have applied to the database must
be undone.
Figure: State transition diagram illustrating the states for transaction execution
A transaction goes into the active state immediately after it starts execution and can execute read and write
operations.
When the transaction ends, it moves to the partially committed state.
At this point, additional checks are done to see if the transaction can be committed or not. If these checks are
successful, the transaction is said to have reached its commit point and enters the committed state. All the changes are
recorded permanently in the database.
A transaction can go to the failed state if one of the checks fails or if the transaction is aborted during its active
state. The transaction may then have to be rolled back to undo the effect of its write operations.
The terminated state corresponds to the transaction leaving the system. All the information about the transaction is
removed from system tables.
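The states and transitions described above can be written down as a small lookup table. This is a minimal Python sketch of the diagram as described in the text; the event names (in particular "terminate") and the function run_transaction are assumptions made for illustration.

```python
# (current state, event) -> next state, following the description above.
TRANSITIONS = {
    ("active", "read"): "active",
    ("active", "write"): "active",
    ("active", "end_transaction"): "partially committed",
    ("active", "abort"): "failed",
    ("partially committed", "commit"): "committed",
    ("partially committed", "abort"): "failed",
    ("committed", "terminate"): "terminated",
    ("failed", "terminate"): "terminated",
}


def run_transaction(events):
    """Return the state reached after the given events, starting from active."""
    state = "active"                   # entered on begin_transaction
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not allowed in state {state!r}")
        state = TRANSITIONS[key]
    return state


print(run_transaction(["read", "write", "end_transaction", "commit"]))
print(run_transaction(["read", "abort"]))
```

A committed transaction has no outgoing "abort" transition, which reflects that a committed transaction is never rolled back.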
Question:
Explain the ACID properties (desirable properties) of transactions.
Desirable Properties of Transactions (ACID Properties)
Transactions should possess several properties, often called the ACID properties:
A Atomicity:
A transaction is an atomic unit of processing; it is either performed in its entirety or not performed at all.
C Consistency Preservation:
A transaction should be consistency preserving; that is, it must take the database from one consistent state to another.
I Isolation/Independence:
A transaction should appear as though it is being executed in isolation from other transactions, even though many transactions are executing concurrently.
D Durability (or Permanency):
If a transaction changes the database and is committed, the changes must never be lost because of any failure.
The atomicity property requires that we execute a transaction to completion. It is the responsibility of the transaction recovery subsystem of a DBMS to ensure atomicity.
The preservation of consistency is generally considered to be the responsibility of the programmers who write the database programs or of the DBMS module that enforces integrity constraints.
The isolation property is enforced by the concurrency control subsystem of the DBMS. If no transaction makes its updates (write operations) visible to other transactions until it is committed, one form of isolation is enforced that solves the temporary update problem and eliminates cascading rollbacks.
Durability is the responsibility of the recovery subsystem.
Question:
Write a short note on System log

The System Log

The system log, which is generally written to stable storage, contains the
redundant data required to recover from volatile storage failures as well as from
errors discovered by the transaction or the database system. The system log is also
known simply as the log and has sometimes been called the DBMS journal. It consists
of the following entries (also known as log records):

a. [start_transaction, T]: Indicates that transaction T has started execution.

b. [write_item, T, X, old_value, new_value]: Indicates that transaction T has changed
the value of database item X from old_value to new_value.

c. [read_item, T, X]: Indicates that transaction T has read the value of database item X.

d. [commit, T]: Indicates that transaction T has completed successfully and affirms that
its effect can be committed (recorded permanently) to the database.

e. [abort, T]: Indicates that transaction T has been aborted.
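
To see why write_item records keep the old value, here is a minimal sketch of undoing an aborted transaction from the log. It assumes the log is an in-memory list of tuples shaped like the records above and that the database is a plain dictionary; the function name undo and the sample values are hypothetical.

```python
def undo(log, db, txn):
    """Roll back txn by restoring old values, newest write first."""
    for record in reversed(log):
        if record[0] == "write_item" and record[1] == txn:
            _, _, item, old_value, _new_value = record
            db[item] = old_value
    return db


# Hypothetical log: T1 writes X twice and then aborts; T2 commits.
log = [
    ("start_transaction", "T1"),
    ("read_item", "T1", "X"),
    ("write_item", "T1", "X", 100, 90),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y", 50, 60),
    ("write_item", "T1", "X", 90, 80),
    ("commit", "T2"),
    ("abort", "T1"),
]

db = {"X": 80, "Y": 60}          # state of the database at the time of the abort
restored = undo(log, dict(db), "T1")
print(restored)                   # X is back to its value before T1 started
```

Scanning the log backward matters when a transaction has written the same item more than once: the oldest old_value is applied last, so the item ends up with the value it had before the transaction began.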


Characterizing Schedules Based on Recoverability

Once a transaction T is committed, it should never be necessary to roll back T.
This ensures that the durability property of transactions is not violated.
The schedules that theoretically meet this criterion are called
recoverable schedules; those that do not are called nonrecoverable and
hence should not be permitted by the DBMS.

The definition of a recoverable schedule is as follows: A schedule S is
recoverable if no transaction T in S commits until all transactions T’ that
have written some item X that T reads have committed. A transaction T
reads from transaction T’ in a schedule S if some item X is first written
by T’ and later read by T. In addition, T’ should not have been aborted
before T reads item X, and there should be no transactions that write X
after T’ writes it and before T reads it.

Consider the schedule Sa’ given below, which is the same as schedule Sa except that two
commit operations have been added to Sa:

Sa’: r1(X); r2(X); w1(X); r1(Y); w2(X); c2; w1(Y); c1;

Sa’ is recoverable, even though it suffers from the lost update problem; this problem is
handled by serializability theory (see Section 21.5). However, consider the two (partial)
schedules Sc and Sd that follow:

Sc: r1(X); w1(X); r2(X); r1(Y); w2(X); c2; a1;

Sd: r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); c1; c2;

Se: r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); a1; a2;

Sc is not recoverable because T2 reads item X from T1 but commits before T1
commits. If T1 aborts after the c2 operation in Sc, the value of X that T2 read is
no longer valid, and T2 would have to be aborted after it has committed, which is
not allowed. For the schedule to be recoverable, the c2 operation in Sc must be
postponed until after T1 commits, as shown in Sd. If T1 aborts instead of committing,
then T2 should also abort, as shown in Se, because the value of X it read is no longer valid.
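
The recoverability test can be sketched in a few lines of code. The sketch below assumes schedules are given as strings in the r/w/c/a notation used above; the function names parse and is_recoverable are hypothetical, and the three test strings are the schedules with commits discussed above (the first is the schedule that suffers from the lost update problem).

```python
import re


def parse(schedule):
    """Turn 'r1(X); w1(X); c1;' into [('r', 1, 'X'), ('w', 1, 'X'), ('c', 1, None)]."""
    ops = []
    for kind, txn, item in re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule):
        ops.append((kind, int(txn), item or None))
    return ops


def is_recoverable(schedule):
    writers = {}      # item -> transactions with live (not aborted) writes, oldest first
    read_from = {}    # reader -> set of transactions it has read from
    committed = set()
    for kind, txn, item in parse(schedule):
        if kind == "w":
            writers.setdefault(item, []).append(txn)
        elif kind == "r":
            stack = writers.get(item, [])
            if stack and stack[-1] != txn:
                read_from.setdefault(txn, set()).add(stack[-1])
        elif kind == "a":
            # An aborted transaction's writes no longer count as sources of reads.
            for stack in writers.values():
                stack[:] = [t for t in stack if t != txn]
        elif kind == "c":
            # txn may commit only if every transaction it read from has committed.
            if not read_from.get(txn, set()) <= committed:
                return False
            committed.add(txn)
    return True


s_a = "r1(X); r2(X); w1(X); r1(Y); w2(X); c2; w1(Y); c1;"
s_c = "r1(X); w1(X); r2(X); r1(Y); w2(X); c2; a1;"
s_d = "r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); c1; c2;"

for name, s in [("lost-update schedule", s_a), ("Sc", s_c), ("Sd", s_d)]:
    print(name, "recoverable" if is_recoverable(s) else "nonrecoverable")
```

The check fails at the first commit by a transaction that has read from a still-uncommitted writer, which is exactly the c2 operation in Sc.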

Recoverable Schedule

A schedule where no committed transactions need to be rolled back.
A transaction T must not commit until all transactions T’ that have written
an item that T reads have committed.
Examples (c: commit, a: abort):
r1(X) w1(X) r2(X) r1(Y) w2(X) c2 a1
Nonrecoverable (T2 must be rolled back when T1 aborts)
r1(X) r2(X) w1(X) r1(Y) w2(X) c2 w1(Y) a1
Recoverable (T2 does not have to be rolled back when T1 aborts)
r2(X) w2(X) r1(X) r1(Y) w1(X) c2 w1(Y) a1
Recoverable (T2 does not have to be rolled back when T1 aborts)
Strict Schedules

A schedule in which we can restore the database to a consistent state
after an abort using the before image of the data item.

A schedule in which a transaction can neither read nor write an item X
until the last transaction that wrote X has committed or aborted.

Example:
r2(X) r1(X) w1(X) w2(X) a1

This schedule is cascadeless but not strict: no transaction reads an item written
by an uncommitted transaction, but T2 writes X before T1, the last writer of X,
has committed or aborted.
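
A minimal sketch of checking both conditions mechanically, assuming the same r/w/c/a string notation. It uses the usual definition of a cascadeless schedule (every transaction reads only items written by committed transactions) together with the strict condition stated above; the names parse and classify are hypothetical.

```python
import re


def parse(schedule):
    """Turn 'r1(X) w1(X) c1' into [('r', 1, 'X'), ('w', 1, 'X'), ('c', 1, None)]."""
    return [(k, int(t), x or None)
            for k, t, x in re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule)]


def classify(schedule):
    """Return (cascadeless, strict) for a schedule string."""
    writers = {}            # item -> uncommitted writers, most recent last
    cascadeless = strict = True
    for kind, txn, item in parse(schedule):
        if kind == "r":
            stack = writers.get(item, [])
            if stack and stack[-1] != txn:
                # Reading an item last written by an uncommitted transaction.
                cascadeless = False
                strict = False
        elif kind == "w":
            stack = writers.setdefault(item, [])
            if stack and stack[-1] != txn:
                # Overwriting an item whose last writer has not finished.
                strict = False
            if txn in stack:
                stack.remove(txn)
            stack.append(txn)
        elif kind == "c":
            # The committed value replaces every earlier uncommitted write.
            for stack in writers.values():
                if txn in stack:
                    del stack[: stack.index(txn) + 1]
        elif kind == "a":
            for stack in writers.values():
                if txn in stack:
                    stack.remove(txn)
    return cascadeless, strict


print(classify("r2(X) r1(X) w1(X) w2(X) a1"))   # the example above
print(classify("w1(X) c1 r2(X) w2(X) c2"))      # a serial, strict schedule
```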


Summary
• Recoverable schedules: no need to roll back committed transactions
• Cascadeless schedules: no cascading rollback (roll back only the
aborted transaction)
• Strict schedules: undo the changes of an aborted transaction by applying
the before image of the affected data items
• Cascadeless schedules are recoverable
• Strict schedules are cascadeless and recoverable

A more stringent condition makes recovery from failure easier but allows
less concurrency.
Characterizing Schedules Based on Serializability
Schedule (or history): the order of execution of operations from all
the various transactions. Schedules that are always considered to be
correct when concurrent transactions are executing are known as
serializable schedules.

Suppose that two users (for example, two airline reservations agents)
submit to the DBMS transactions T1 and T2 at approximately
the same time. If no interleaving of operations is permitted, there are
only two possible outcomes:

1. Execute all the operations of transaction T1 (in sequence) followed
by all the operations of transaction T2 (in sequence).

2. Execute all the operations of transaction T2 (in sequence)
followed by all the operations of transaction T1 (in sequence).
Serial schedule:
– A schedule S is serial if, for every transaction T participating in the schedule, all
the operations of T are executed consecutively in the schedule.
• Otherwise, the schedule is called a nonserial schedule.

Serializable schedule:
– A schedule S is serializable if it is equivalent to some serial schedule of the
same n transactions.

Result equivalent:
– Two schedules are called result equivalent if they produce the same final state
of the database.

Conflict equivalent:
– Two schedules are said to be conflict equivalent if the order of any two
conflicting operations is the same in both schedules.

Conflict serializable:
– A schedule S is said to be conflict serializable if it is conflict equivalent to some
serial schedule S’.
Question:
Problems related to Testing conflict serializability of a Schedule using
precedence graph

Testing conflict serializability of a Schedule S

1. For each transaction Ti participating in schedule S, create a node labeled Ti in the
precedence graph.
2. For each case in S where Tj executes a read_item(X) after Ti executes a
write_item(X), create an edge (Ti → Tj) in the precedence graph.
3. For each case in S where Tj executes a write_item(X) after Ti executes a
read_item(X), create an edge (Ti → Tj) in the precedence graph.
4. For each case in S where Tj executes a write_item(X) after Ti executes a
write_item(X), create an edge (Ti → Tj) in the precedence graph.
5. The schedule S is conflict serializable if and only if the precedence graph has no
cycles.
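
The procedure above can be sketched as follows, assuming schedules are written as strings such as "r1(X); w1(X);". The function names (parse, precedence_graph, has_cycle, is_conflict_serializable) are hypothetical, and the cycle test simply peels off nodes with no incoming edges until none are left or no such node exists.

```python
import re
from itertools import combinations


def parse(schedule):
    """Keep only read/write operations: 'r1(X); c1;' -> [('r', 1, 'X')]."""
    return [(k, int(t), x)
            for k, t, x in re.findall(r"([rw])(\d+)\((\w+)\)", schedule)]


def precedence_graph(schedule):
    ops = parse(schedule)
    nodes = {txn for _, txn, _ in ops}
    edges = set()
    # combinations() keeps schedule order, so the first operation is the earlier one.
    for (k1, t1, x1), (k2, t2, x2) in combinations(ops, 2):
        # Two operations conflict if they belong to different transactions,
        # access the same item, and at least one of them is a write.
        if t1 != t2 and x1 == x2 and "w" in (k1, k2):
            edges.add((t1, t2))
    return nodes, edges


def has_cycle(nodes, edges):
    remaining = set(nodes)
    while remaining:
        # Nodes with no incoming edge from another remaining node.
        free = {n for n in remaining
                if not any(a in remaining and b == n for a, b in edges)}
        if not free:
            return True       # every remaining node lies on or behind a cycle
        remaining -= free
    return False


def is_conflict_serializable(schedule):
    return not has_cycle(*precedence_graph(schedule))


interleaved = "r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);"
serial = "r1(X); w1(X); r1(Y); w1(Y); r2(X); w2(X);"
print(precedence_graph(interleaved)[1], is_conflict_serializable(interleaved))
print(precedence_graph(serial)[1], is_conflict_serializable(serial))
```

In the interleaved schedule, r1(X) before w2(X) gives the edge T1 → T2 and r2(X) before w1(X) gives T2 → T1, so the graph contains a cycle.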

Fig: Constructing the precedence graphs for schedules A to D from fig 21.5 to test for conflict serializability.
(a) Precedence graph for serial schedule A.
(b) Precedence graph for serial schedule B.
(c) Precedence graph for schedule C (not serializable).
(d) Precedence graph for schedule D (serializable, equivalent to schedule A).

Fig: Example of serializability testing. (a) The READ and WRITE operations
of three transactions T1, T2, and T3.

Draw the precedence graph for schedule E.

Fig: Precedence graph for schedule E

Fig: Precedence graph for schedule F

Question:
Briefly Explain Transaction support in SQL

Transaction Support in SQL
The basic definition of an SQL transaction is that it is a logical unit of work and is
guaranteed to be atomic.
A single SQL statement is always considered to be atomic: either it completes
execution without an error, or it fails and leaves the database unchanged.
Every transaction must have an explicit end statement, which is either a COMMIT or a
ROLLBACK.

Every transaction has certain characteristics, which are specified by a SET TRANSACTION
statement. The characteristics are:

•The access mode
- can be specified as READ ONLY or READ WRITE
- The default is READ WRITE, unless the isolation level READ UNCOMMITTED is
specified, in which case READ ONLY is assumed
- A mode of READ WRITE allows select, update, insert, delete, and create
commands to be executed
- A mode of READ ONLY, as the name implies, is simply for data retrieval.
•The diagnostic area size
- DIAGNOSTIC SIZE n specifies an integer value n, which indicates the
number of conditions that can be held simultaneously in the diagnostic area.

•The isolation level
- specified using the statement ISOLATION LEVEL <isolation>, where the value
for <isolation> can be READ UNCOMMITTED, READ COMMITTED, REPEATABLE
READ, or SERIALIZABLE
- The default isolation level is SERIALIZABLE

If a transaction executes at a lower isolation level than SERIALIZABLE, then one or
more of the following three violations may occur:
1. Dirty read: A transaction T1 may read the update of a transaction T2, which has
not yet committed.
2. Nonrepeatable read: A transaction T1 may read a given value from a table. If
another transaction T2 later updates that value and T1 reads that value again, T1
will see a different value.
3. Phantoms: A transaction T1 may read a set of rows from a table, perhaps based on
some condition specified in the SQL WHERE-clause. Now suppose that a transaction
T2 inserts a new row that also satisfies the WHERE-clause condition used in T1, into
the table used by T1. If T1 is repeated, then T1 will see a phantom, a row that
previously did not exist.

A sample SQL transaction consists of first inserting a new row in
the EMPLOYEE table and then updating the salary of all
employees who work in department 2.
If an error occurs on any of the SQL statements, the
entire transaction is rolled back.
This implies that any salary updated by this
transaction would be restored to its previous value and that
the newly inserted row would be removed.
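
A transaction of this shape can be tried out with Python's built-in sqlite3 module. This is a hedged sketch, not the original embedded SQL example: the column names (Fname, Lname, Ssn, Dno, Salary), the CHECK constraint, the sample rows, and the function hire_and_raise are all assumptions made for illustration, and SQLite has no SET TRANSACTION statement, so access mode and isolation level are not set here.

```python
import sqlite3


def hire_and_raise(conn, new_row, dno=2, factor=1.1):
    """Insert one employee, then raise salaries in department dno, atomically."""
    try:
        conn.execute("BEGIN")
        conn.execute(
            "INSERT INTO EMPLOYEE (Fname, Lname, Ssn, Dno, Salary) "
            "VALUES (?, ?, ?, ?, ?)",
            new_row,
        )
        conn.execute(
            "UPDATE EMPLOYEE SET Salary = Salary * ? WHERE Dno = ?",
            (factor, dno),
        )
        conn.execute("COMMIT")
        return True
    except sqlite3.Error:
        # Any failing statement undoes the whole transaction, including the insert.
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        return False


# isolation_level=None puts the connection in autocommit mode,
# so BEGIN / COMMIT / ROLLBACK are issued explicitly above.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT PRIMARY KEY, "
    "Dno INTEGER, Salary REAL CHECK (Salary < 5000))"   # hypothetical salary cap
)
conn.execute("INSERT INTO EMPLOYEE VALUES ('Ann', 'Lee', '111', 2, 1000.0)")

ok1 = hire_and_raise(conn, ("Bob", "Ray", "222", 2, 2000.0))  # both statements succeed
ok2 = hire_and_raise(conn, ("Cy", "Fox", "333", 2, 4800.0))   # the UPDATE breaks the cap
rows = conn.execute("SELECT Ssn, Salary FROM EMPLOYEE ORDER BY Ssn").fetchall()
print(ok1, ok2, rows)
```

In the second call the INSERT itself succeeds, but the UPDATE violates the salary cap, so the rollback removes the newly inserted row and leaves the earlier salaries untouched.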
