Bcs403 Dbms Module 4 Note Chptrs
Chapter 1: SQL Advanced Queries
More Complex SQL Retrieval Queries
Comparisons Involving NULL and Three-Valued Logic
SQL has various rules for dealing with NULL values. NULL is used to represent a
missing value, but it usually has one of three different interpretations: value
unknown, value unavailable or withheld, or value not applicable.
Examples
1. Unknown value. A person’s date of birth is not known, so it is represented by
NULL in the database.
2. Unavailable or withheld value. A person has a home phone but does not want
it to be listed, so it is withheld and represented as NULL in the database.
3. Not applicable attribute. An attribute CollegeDegree would be NULL for a person
who has no college degrees because it does not apply to that person.
Each individual NULL value is considered to be different from every other NULL value in
the various database records. When a NULL is involved in a comparison operation, the
result is considered to be UNKNOWN (it may be TRUE or it may be FALSE). Hence,
SQL uses a three-valued logic with values TRUE, FALSE, and UNKNOWN instead
of the standard two-valued (Boolean) logic with values TRUE or FALSE. It is
therefore necessary to define the results (or truth values) of three-valued logical
expressions when the logical connectives AND, OR, and NOT are used. Rather than
using = or <>, SQL tests for NULL with the comparison operators IS NULL and IS NOT NULL.
Example: Retrieve the names of all employees who do not have supervisors.
SELECT Fname, Lname
FROM EMPLOYEE
WHERE Super_ssn IS NULL;
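The difference between comparing with NULL and testing for NULL can be tried out with a small script. The following is a minimal sketch using Python's built-in sqlite3 module; the three EMPLOYEE rows are made-up sample data, and only the columns needed for the example are created.

```python
import sqlite3

# Hypothetical sample data: a cut-down EMPLOYEE table with three rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT, Super_ssn TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
    [
        ("James", "Borg", "888665555", None),           # no supervisor
        ("Franklin", "Wong", "333445555", "888665555"),
        ("John", "Smith", "123456789", "333445555"),
    ],
)

# '=' compared with NULL yields UNKNOWN, so no row satisfies the WHERE clause.
with_equals = conn.execute(
    "SELECT Fname, Lname FROM EMPLOYEE WHERE Super_ssn = NULL"
).fetchall()

# IS NULL tests for NULL directly and evaluates to TRUE or FALSE.
with_is_null = conn.execute(
    "SELECT Fname, Lname FROM EMPLOYEE WHERE Super_ssn IS NULL"
).fetchall()

# UNKNOWN (shown as None) combined with the logical connectives.
connectives = conn.execute(
    "SELECT (NULL = NULL), (NULL AND 0), (NULL OR 1), (NOT NULL)"
).fetchone()

print(with_equals)   # rows found with '= NULL'
print(with_is_null)  # rows found with 'IS NULL'
print(connectives)   # truth values; None stands for UNKNOWN
```

Only rows for which the WHERE condition is TRUE are selected; rows that evaluate to FALSE or UNKNOWN are left out, which is why the first query finds nothing.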
Nested Queries, Tuples, and Set/Multiset Comparisons
Some queries require that existing values in the database be fetched and then used in
a comparison condition. Such queries can be conveniently formulated by using
nested queries, which are complete select-from-where blocks within the WHERE
clause of another query. That other query is called the outer query.
Example 1: List the project numbers of projects that have an employee with last
name 'Smith' as manager.
SELECT DISTINCT Pnumber
FROM PROJECT
WHERE Pnumber IN
(SELECT Pnumber
FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE Dnum=Dnumber AND Mgr_ssn=Ssn AND Lname='Smith');
We make use of the comparison operator IN, which compares a value v with a set
(or multiset) of values V and evaluates to TRUE if v is one of the elements in V.
Nested Queries: Comparison Operators
Other comparison operators can be used to compare a single value v to a set or multiset
V. The = ANY (or = SOME) operator returns TRUE if the value v is equal to some value in
the set V and is hence equivalent to IN. The two keywords ANY and SOME have the
same effect. Other operators that can be combined with ANY (or SOME) include >, >=, <,
<=, and <>. The keyword ALL can also be combined with each of these operators.
For example, the comparison condition (v > ALL V) returns TRUE if the value v is greater
than all the values in the set (or multiset) V. A typical use is a query that
returns the names of employees whose salary is greater than the salary of all the
employees in department 5.
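The meaning of = ANY and > ALL can be sketched outside SQL. The following Python fragment is only an illustration of the semantics: the helper names and the salary figures are made up, and NULL values are ignored for simplicity.

```python
def equals_any(v, V):
    """v = ANY V (same as v IN V): TRUE if v equals at least one element of V."""
    return any(v == x for x in V)


def greater_than_all(v, V):
    """v > ALL V: TRUE if v is greater than every element of V."""
    return all(v > x for x in V)


# Hypothetical sample data: (last name, salary, department number)
employees = [
    ("Smith", 30000, 5), ("Wong", 40000, 5), ("Narayan", 38000, 5),
    ("Wallace", 43000, 4), ("Borg", 55000, 1), ("Zelaya", 25000, 4),
]

# Inner "query": the multiset of salaries in department 5.
dept5_salaries = [salary for _, salary, dno in employees if dno == 5]

# Outer "query": employees whose salary is greater than ALL of those values.
higher_paid = [name for name, salary, _ in employees
               if greater_than_all(salary, dept5_salaries)]

print(higher_paid)
```

Note that v > ALL V is TRUE when V is empty, because there is no element of V that v fails to exceed.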
Example: List the names of managers who have at least one dependent
SELECT Fname, Lname
FROM EMPLOYEE
WHERE EXISTS ( SELECT *
FROM DEPENDENT
WHERE Ssn=Essn )
AND
EXISTS ( SELECT *
FROM DEPARTMENT
WHERE Ssn=Mgr_ssn );
In general, EXISTS(Q) returns TRUE if there is at least one tuple in the result of the nested query Q,
and it returns FALSE otherwise.
NOT EXISTS Functions
NOT EXISTS(Q) returns TRUE if there are no tuples in the result of nested query Q,
and it returns FALSE otherwise.
For example, in a query that retrieves the names of employees who have no dependents,
the correlated nested query selects, for each EMPLOYEE tuple, all DEPENDENT
tuples whose Essn value matches the EMPLOYEE Ssn; if the result is empty, no
dependents are related to the employee, so we select that EMPLOYEE tuple and
retrieve its Fname and Lname.
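The behaviour described above can be checked with a small script. This is a minimal sketch using Python's sqlite3 module; the table contents are made-up sample data and only the columns used by the queries are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT PRIMARY KEY);
CREATE TABLE DEPENDENT (Essn TEXT, Dependent_name TEXT);
INSERT INTO EMPLOYEE VALUES ('John', 'Smith', '123456789');
INSERT INTO EMPLOYEE VALUES ('Franklin', 'Wong', '333445555');
INSERT INTO EMPLOYEE VALUES ('Joyce', 'English', '453453453');
INSERT INTO DEPENDENT VALUES ('123456789', 'Alice');
INSERT INTO DEPENDENT VALUES ('333445555', 'Theodore');
""")

# NOT EXISTS: the nested query is re-evaluated for each EMPLOYEE tuple and
# the tuple is selected only when the nested result is empty.
no_dependents = conn.execute("""
    SELECT Fname, Lname
    FROM EMPLOYEE
    WHERE NOT EXISTS (SELECT * FROM DEPENDENT WHERE Ssn = Essn)
""").fetchall()

# EXISTS: the tuple is selected when the nested result has at least one tuple.
with_dependents = conn.execute("""
    SELECT Fname, Lname
    FROM EMPLOYEE
    WHERE EXISTS (SELECT * FROM DEPENDENT WHERE Ssn = Essn)
    ORDER BY Lname
""").fetchall()

print(no_dependents)
print(with_dependents)
```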
UNIQUE Functions
UNIQUE(Q) returns TRUE if there are no duplicate tuples in the result
of query Q; otherwise, it returns FALSE.
• The UNIQUE constraint ensures that all values in a column are
different.
• Both the UNIQUE and PRIMARY KEY constraints provide a
guarantee for uniqueness for a column or set of columns.
• A PRIMARY KEY constraint automatically has a UNIQUE constraint.
SQL UNIQUE Constraint on CREATE TABLE
CREATE TABLE Persons (
ID int NOT NULL UNIQUE,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int
);
SQL UNIQUE Constraint on ALTER TABLE
To create a UNIQUE constraint on the "ID" column when the table is already
created, use the following SQL:
ALTER TABLE Persons
ADD UNIQUE (ID);
Explicit Sets and Renaming of Attributes in SQL
NATURAL JOIN is a type of EQUIJOIN where the join predicate arises implicitly by
comparing all columns in both tables that have the same column-names in the joined
tables. The resulting joined table contains only one column for each pair of equally named
columns.
OUTER JOIN
An outer join does not require each record in the two joined tables to have a matching
record. The joined table retains each record, even if no other matching record exists.
Outer joins subdivide further into
• Left outer joins
• Right outer joins
• Full outer joins
Example
SELECT Customers.customer_id, Customers.first_name, Orders.amount
FROM Customers
RIGHT JOIN
Orders
ON Customers.customer_id = Orders.customer;
LEFT JOIN With WHERE Clause
The SQL command can have an optional WHERE clause with the LEFT
JOIN statement. For example,
Example
SELECT Customers.customer_id, Customers.first_name, Orders.amount
FROM Customers
LEFT JOIN Orders
ON Customers.customer_id = Orders.customer
WHERE Orders.amount >= 500;
Example
SELECT Customers.customer_id, Customers.first_name, Orders.amount
FROM Customers
RIGHT JOIN Orders
ON Customers.customer_id = Orders.customer
WHERE Orders.amount >= 500;
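The effect of the WHERE clause on an outer join is easy to miss, so here is a minimal sketch using Python's sqlite3 module. The Customers and Orders rows are made-up sample data. Only LEFT JOIN is shown, because older SQLite versions do not accept RIGHT JOIN or FULL OUTER JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (customer_id INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, amount INTEGER, customer INTEGER);
INSERT INTO Customers VALUES (1, 'John'), (2, 'Robert'), (3, 'David');
INSERT INTO Orders VALUES (10, 400, 1), (11, 800, 2);
""")

# Plain LEFT JOIN: every customer is kept; David has no order, so amount is NULL.
left_join = conn.execute("""
    SELECT Customers.customer_id, Customers.first_name, Orders.amount
    FROM Customers
    LEFT JOIN Orders ON Customers.customer_id = Orders.customer
    ORDER BY Customers.customer_id
""").fetchall()

# LEFT JOIN with WHERE: the filter is applied after the join, and
# NULL >= 500 is UNKNOWN, so the unmatched customer is dropped again.
left_join_where = conn.execute("""
    SELECT Customers.customer_id, Customers.first_name, Orders.amount
    FROM Customers
    LEFT JOIN Orders ON Customers.customer_id = Orders.customer
    WHERE Orders.amount >= 500
    ORDER BY Customers.customer_id
""").fetchall()

print(left_join)
print(left_join_where)
```

A condition on a column of the right-hand table therefore removes the rows that the outer join had preserved, so the filtered result here matches what an inner join would return.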
SQL FULL OUTER JOIN
It joins two tables based on a common column, and selects records that
have matching values in these columns and remaining rows from both of
the tables.
Syntax of FULL OUTER JOIN
SELECT columns
FROM table1
FULL OUTER JOIN table2
ON table1.column_name = table2.column_name;
FULL OUTER JOIN With WHERE Clause
The SQL command can have an optional WHERE clause with the FULL OUTER JOIN
statement. For example,
SELECT Customers.customer_id, Customers.first_name, Orders.amount
FROM Customers
FULL OUTER JOIN Orders
ON Customers.customer_id = Orders.customer
WHERE Orders.amount >= 500;
Aggregate Functions in SQL
1. Find the sum of the salaries of all employees, the maximum salary, the
minimum salary, and the average salary.
SELECT SUM (Salary), MAX (Salary), MIN (Salary), AVG (Salary)
FROM EMPLOYEE;
2. Find the sum of the salaries of all employees of the ‘Research’ department, as
well as the maximum salary, the minimum salary, and the average salary in this
department.
SELECT SUM (Salary), MAX (Salary), MIN (Salary), AVG (Salary)
FROM (EMPLOYEE JOIN DEPARTMENT ON Dno=Dnumber)
WHERE Dname='Research';
Each group (partition) will consist of the tuples that have the same value of some
attribute(s), called the grouping attribute(s).
SQL has a GROUP BY clause for this purpose. The GROUP BY clause specifies the
grouping attributes, which should also appear in the SELECT clause, so that the value
resulting from applying each aggregate function to a group of tuples appears along
with the value of the grouping attribute(s).
Example: For each department, retrieve the department number, the number of
employees in the department, and their average salary.
SELECT Dno, COUNT (*), AVG (Salary)
FROM EMPLOYEE
GROUP BY Dno;
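The grouping query can be run against a few rows to see one result row per group. This is a minimal sketch using Python's sqlite3 module; the six EMPLOYEE rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Lname TEXT, Salary INTEGER, Dno INTEGER);
INSERT INTO EMPLOYEE VALUES
  ('Smith', 30000, 5), ('Wong', 40000, 5), ('Narayan', 38000, 5),
  ('Wallace', 43000, 4), ('Zelaya', 25000, 4), ('Borg', 55000, 1);
""")

# One output row per distinct Dno; COUNT and AVG are computed inside each group.
per_department = conn.execute("""
    SELECT Dno, COUNT(*), AVG(Salary)
    FROM EMPLOYEE
    GROUP BY Dno
    ORDER BY Dno
""").fetchall()

for dno, head_count, avg_salary in per_department:
    print(dno, head_count, avg_salary)
```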
HAVING provides a condition on the summary information
regarding the group of tuples associated with each value of the
grouping attributes. Only the groups that satisfy the condition are
retrieved in the result of the query.
Example: For each project on which more than two employees work,
retrieve the project number, the project name, and the number of
employees who work on the project.
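One way to write this query, sketched with Python's sqlite3 module, is shown below. The PROJECT and WORKS_ON rows are made-up sample data, and the attribute names Pnumber, Pname, and Pno are assumed to follow the COMPANY schema used elsewhere in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PROJECT (Pnumber INTEGER PRIMARY KEY, Pname TEXT);
CREATE TABLE WORKS_ON (Essn TEXT, Pno INTEGER);
INSERT INTO PROJECT VALUES (1, 'ProductX'), (2, 'ProductY'), (3, 'ProductZ');
INSERT INTO WORKS_ON VALUES
  ('123456789', 1), ('453453453', 1),
  ('123456789', 2), ('453453453', 2), ('333445555', 2),
  ('666884444', 3);
""")

# WHERE filters individual tuples before grouping;
# HAVING filters whole groups after the aggregate has been computed.
busy_projects = conn.execute("""
    SELECT Pnumber, Pname, COUNT(*)
    FROM PROJECT, WORKS_ON
    WHERE Pnumber = Pno
    GROUP BY Pnumber, Pname
    HAVING COUNT(*) > 2
""").fetchall()

print(busy_projects)
```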
A view in SQL is a single table that is derived from other tables; it is useful for
queries that are issued frequently.
For example, referring to the COMPANY database, we may frequently issue queries that
retrieve the employee name and the project names that the employee works on. Rather
than having to specify the join of the three tables EMPLOYEE, WORKS_ON, and
PROJECT every time we issue this query, we can define a view that is specified as the
result of these joins. Then we can issue queries on the view, which are specified as single
table retrievals rather than as retrievals involving two joins on three tables. We call the
EMPLOYEE, WORKS_ON, and PROJECT tables the defining tables of the view.
Specification of Views in SQL
In SQL, the command to specify a view is CREATE VIEW. The view is given a (virtual)
table name (or view name), a list of attribute names, and a query to specify the contents
of the view.
Example 1:
CREATE VIEW WORKS_ON1
AS SELECT Fname, Lname, Pname, Hours
FROM EMPLOYEE, PROJECT, WORKS_ON
WHERE Ssn=Essn AND Pno=Pnumber;
We can now specify SQL queries on a view—or virtual table—in the same way we specify
queries involving base tables. For example, to retrieve the last name and first name of all
employees who work on the ‘ProductX’ project, we can utilize the WORKS_ON1 view and
specify the query as:
SELECT Fname, Lname
FROM WORKS_ON1
WHERE Pname='ProductX';
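The view and the query on it can be tried out as follows. This is a minimal sketch using Python's sqlite3 module; the rows are made-up sample data, and the script also re-runs the query after a base table changes to show that the view reflects the current contents of its defining tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT PRIMARY KEY);
CREATE TABLE PROJECT (Pname TEXT, Pnumber INTEGER PRIMARY KEY);
CREATE TABLE WORKS_ON (Essn TEXT, Pno INTEGER, Hours REAL);
INSERT INTO EMPLOYEE VALUES ('John', 'Smith', '123456789'),
                            ('Joyce', 'English', '453453453'),
                            ('Franklin', 'Wong', '333445555');
INSERT INTO PROJECT VALUES ('ProductX', 1), ('ProductY', 2);
INSERT INTO WORKS_ON VALUES ('123456789', 1, 32.5), ('123456789', 2, 7.5),
                            ('453453453', 1, 20.0), ('333445555', 2, 10.0);

CREATE VIEW WORKS_ON1 AS
  SELECT Fname, Lname, Pname, Hours
  FROM EMPLOYEE, PROJECT, WORKS_ON
  WHERE Ssn = Essn AND Pno = Pnumber;
""")

QUERY = "SELECT Fname, Lname FROM WORKS_ON1 WHERE Pname = 'ProductX' ORDER BY Lname"

before = conn.execute(QUERY).fetchall()

# Change a defining (base) table; the view itself is not touched.
conn.execute("INSERT INTO WORKS_ON VALUES ('333445555', 1, 5.0)")

after = conn.execute(QUERY).fetchall()

print(before)
print(after)
```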
View Implementation, View Update and Inline Views
The problem of efficiently implementing a view for querying is complex. Two main
approaches have been suggested.
• One strategy, called query modification, involves modifying or transforming the
view query (submitted by the user) into a query on the underlying base tables. For
example, the query
SELECT Fname, Lname
FROM WORKS_ON1
WHERE Pname='ProductX';
would be rewritten by the DBMS as a query on EMPLOYEE, PROJECT, and WORKS_ON,
with the join conditions of the view (Ssn=Essn AND Pno=Pnumber) added to the
condition Pname='ProductX'.
The disadvantage of this approach is that it is inefficient for views defined via complex
queries that are time-consuming to execute, especially if multiple queries are going to
be applied to the same view within a short period of time.
The second strategy, called view materialization, involves physically creating a
temporary view table when the view is first queried and keeping that table on the
assumption that other queries on the view will follow. In this case, an efficient strategy for
automatically updating the view table when the base tables are updated must be
developed in order to keep the view up-to-date.
Techniques using the concept of incremental update have been developed for this
purpose, where the DBMS can determine what new tuples must be inserted, deleted, or
modified in a materialized view table when a database update is applied to one of the
defining base tables.
Updating of views
Updating a view is complicated and can be ambiguous. In general, an update on a view
defined on a single table without any aggregate functions can be mapped to an update on the
underlying base table under certain conditions. For a view involving joins, an update operation may
be mapped to update operations on the underlying base relations in multiple ways. Hence, it is often
not possible for the DBMS to determine which of the updates is intended.
To illustrate potential problems with updating a view defined on multiple tables,
consider the WORKS_ON1 view, and suppose that we issue the command to update
the Pname attribute of 'John Smith' from 'ProductX' to 'ProductY'.
UV1:
UPDATE WORKS_ON1
SET Pname = 'ProductY'
WHERE Lname='Smith' AND Fname='John'
AND Pname='ProductX';
Syntax
The syntax for an inline view is,
Query 1
CREATE TABLE User_Higher_Than_200 AS
SELECT User_ID, SUM(Score) FROM User_Score
GROUP BY User_ID
HAVING SUM(Score) > 200;
Query 2
SELECT a2.ZIP_CODE, COUNT(a1.User_ID)
FROM User_Higher_Than_200 a1, User_Address a2
WHERE a1.User_ID = a2.User_ID
GROUP BY a2.ZIP_CODE;
In the code, we introduced a temporary table, User_Higher_Than_200, to store the
list of users who scored higher than 200. User_Higher_Than_200 is then used to join
to the User_Address table to get the final result.
We can simplify the above SQL using the inline view construct as follows:
Query 3
In Query 3, the inline view is the subquery placed in the FROM clause in place of the
temporary table. There are two advantages of using an inline view here:
1. We do not need to create the temporary table. This prevents the database from
having too many objects, which is a good thing as each additional object in the
database costs resources to manage.
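A single-statement version that uses an inline view could look like the sketch below, written with Python's sqlite3 module. The rows in User_Score and User_Address are made-up sample data, and the column alias Total is an assumption added for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User_Score (User_ID INTEGER, Score INTEGER);
CREATE TABLE User_Address (User_ID INTEGER, ZIP_CODE TEXT);
INSERT INTO User_Score VALUES (1, 150), (1, 100), (2, 120),
                              (3, 300), (4, 90), (4, 130);
INSERT INTO User_Address VALUES (1, '90001'), (2, '90001'),
                                (3, '90002'), (4, '90001');
""")

# The parenthesised SELECT in the FROM clause is the inline view (alias a1).
# It plays the role of the temporary table User_Higher_Than_200.
users_per_zip = conn.execute("""
    SELECT a2.ZIP_CODE, COUNT(a1.User_ID)
    FROM (SELECT User_ID, SUM(Score) AS Total
          FROM User_Score
          GROUP BY User_ID
          HAVING SUM(Score) > 200) a1,
         User_Address a2
    WHERE a1.User_ID = a2.User_ID
    GROUP BY a2.ZIP_CODE
    ORDER BY a2.ZIP_CODE
""").fetchall()

print(users_per_zip)
```

No table is created or left behind; the inline view exists only for the duration of the query.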
General form :
CREATE ASSERTION <Name_of_assertion> CHECK (<cond>)
1. Write an assertion to specify the constraint that the Sum of loans taken by a customer
does not exceed 100,000
General form:
CREATE TRIGGER <name>
BEFORE | AFTER <events>
FOR EACH ROW | FOR EACH STATEMENT
WHEN (<condition>)
<action>
A trigger has three components:
1. Event: The database operation (for example INSERT, DELETE, or UPDATE) that activates the trigger
2. Condition (optional): If the condition is true, the trigger executes, otherwise it is skipped
3. Action: The actions performed by the trigger
When the Event occurs and Condition is true, execute the Action.
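CREATE ASSERTION is not accepted by many systems, so a constraint such as the loan limit mentioned earlier is often enforced with a trigger instead. The following is a minimal sketch in SQLite's trigger dialect, run through Python's sqlite3 module. The LOAN table, its columns, and the trigger name Loan_limit are hypothetical. In this sketch the condition is written inside the action as a WHERE clause rather than in a WHEN clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LOAN (Customer_id INTEGER, Amount INTEGER);

-- Event: BEFORE INSERT ON LOAN
-- Condition: existing loans plus the new loan exceed 100000
-- Action: abort the INSERT with an error message
CREATE TRIGGER Loan_limit
BEFORE INSERT ON LOAN
FOR EACH ROW
BEGIN
    SELECT RAISE(ABORT, 'sum of loans exceeds 100000')
    WHERE (SELECT COALESCE(SUM(Amount), 0) FROM LOAN
           WHERE Customer_id = NEW.Customer_id) + NEW.Amount > 100000;
END;
""")

conn.execute("INSERT INTO LOAN VALUES (1, 60000)")
conn.execute("INSERT INTO LOAN VALUES (1, 30000)")

try:
    conn.execute("INSERT INTO LOAN VALUES (1, 20000)")  # would bring the total to 110000
    rejected = False
except sqlite3.DatabaseError as exc:
    rejected = True
    print("trigger fired:", exc)

total = conn.execute(
    "SELECT SUM(Amount) FROM LOAN WHERE Customer_id = 1"
).fetchone()[0]
print(total)
```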
Assertions vs. Triggers
Assertions do not modify the data, they only check certain conditions. Triggers are more
powerful because they can check conditions and also modify the data.
Assertions are not linked to specific tables in the database and not linked to specific events.
Triggers are linked to specific tables and specific events
All assertions can be implemented as triggers (one or more). Not all triggers can be
implemented as assertions
Chapter 2: Transaction Processing
Transaction:
A transaction is a program including a collection of database operations, executed as a
logical unit of data processing. The operations performed in a transaction include one or
more database operations such as insert, delete, update, or retrieve. It is an atomic
process that is either performed to completion or not performed at all. Each
high level operation can be divided into a number of low level tasks or operations. For
example, a data update operation can be divided into three tasks −
read_item() − reads data item from storage to main memory.
modify_item() − change value of item in the main memory.
write_item() − write the modified value from main memory to storage.
The figure shows two processes, A and B, executing concurrently in an interleaved
fashion. Interleaving keeps the CPU busy when a process requires an input or output
(I/O) operation, such as reading a block from disk. The CPU is switched to execute
another process rather than remaining idle during I/O time.
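Basic DB access operations that a transaction can include are:
read_item(X): reads a database item named X into a program variable.
write_item(X): writes the value of program variable X into the database item named X.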
Question:
Why concurrency control and recovery are needed in DBMS?
(or)
List & Explain the types of problem that may occur when 2 transactions
run concurrently.
(or)
What are the anomalies that can occur due to interleaved execution?
(or)
What are the types of problems that may occur when 2 transactions run
concurrently?
Why Concurrency Control Is Needed
1. The Lost Update Problem
For example:
X = 80 at the start (there were 80 reservations on the flight)
N = 5 (T1 transfers 5 seat reservations from the flight corresponding to X to the flight corresponding to Y)
M = 4 (T2 reserves 4 seats on X)
The final result should be X = 79.
The interleaving of operations shown in the figure yields X = 84, because the update in T1
that removed the five seats from X was lost.
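The arithmetic of this example can be replayed with a short simulation. The following Python sketch is only an illustration: the step names, the execute helper, and the starting value Y = 50 are assumptions, while X = 80, N = 5, and M = 4 come from the example.

```python
N, M = 5, 4  # seats transferred by T1, seats reserved by T2


def execute(schedule, db):
    """Run (transaction, step) pairs in order; each transaction has private variables."""
    local = {"T1": {}, "T2": {}}
    for txn, step in schedule:
        var = local[txn]
        if step == "read X":
            var["X"] = db["X"]
        elif step == "write X":
            db["X"] = var["X"]
        elif step == "read Y":
            var["Y"] = db["Y"]
        elif step == "write Y":
            db["Y"] = var["Y"]
        elif step == "X = X - N":
            var["X"] -= N
        elif step == "X = X + M":
            var["X"] += M
        elif step == "Y = Y + N":
            var["Y"] += N
    return db


T1 = ["read X", "X = X - N", "write X", "read Y", "Y = Y + N", "write Y"]
T2 = ["read X", "X = X + M", "write X"]

# Serial execution: all of T1, then all of T2.
serial = [("T1", s) for s in T1] + [("T2", s) for s in T2]

# Interleaving in which T2 reads X before T1 writes it and writes X after T1.
interleaved = [
    ("T1", "read X"), ("T1", "X = X - N"),
    ("T2", "read X"), ("T2", "X = X + M"),
    ("T1", "write X"), ("T1", "read Y"),
    ("T2", "write X"),
    ("T1", "Y = Y + N"), ("T1", "write Y"),
]

serial_result = execute(serial, {"X": 80, "Y": 50})
lost_update_result = execute(interleaved, {"X": 80, "Y": 50})

print(serial_result)
print(lost_update_result)
```

In the interleaved run, T2 computes its new value from the X it read before T1 wrote, so T2's write overwrites T1's subtraction.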
2. The Temporary Update / Dirty Read Problem [WR conflict]
This problem occurs when one transaction updates a database item and then the transaction
fails for some reason before committing.
Meanwhile, the updated item is accessed by another transaction before it is changed back to
its original value.
3. The Incorrect Summary Problem
• If one transaction is calculating an aggregate summary function on a number of DB items while
other transactions are updating some of these items, the aggregate function may calculate some
values before they are updated and others after they are updated.
4. The Unrepeatable Read Problem [RW conflict]
Transaction T reads the same item twice and gets different values on
each read, since the item was modified by another transaction T' between the
two reads.
For example, during an airline reservation transaction, a customer inquires about
seat availability on several flights.
When the customer decides on a particular flight, the transaction reads the
number of seats on that flight a second time before completing the reservation, and it may end
up reading a different value for the item.
Why Recovery Is Needed
Whenever a transaction is submitted to a DBMS for execution, the system is
responsible for making sure that either:
• All the operations in the transaction are completed successfully and their effect is recorded
permanently in the database, or
• The transaction does not have any effect on the database or any other transactions.
In the first case, the transaction is said to be committed, whereas in the second case, the
transaction is aborted.
If a transaction fails after executing some of its operations but before executing all of them,
the operations already executed must be undone and have no lasting effect.
Transaction States and Operations
A transaction is an atomic unit of work that should either be completed in its entirety
or not done at all. For recovery purposes, the system keeps track of when each
transaction starts, terminates, and commits or aborts.
Figure: State transition diagram illustrating the states for transaction execution
A transaction goes into the active state immediately after it starts execution and can execute read and write
operations.
When the transaction ends, it moves to the partially committed state.
At this point, additional checks are done to see if the transaction can be committed or not. If these checks are
successful, the transaction is said to have reached its commit point and enters the committed state. All the changes are
recorded permanently in the database.
A transaction can go to the failed state if one of the checks fails or if the transaction is aborted during its active
state. The transaction may then have to be rolled back to undo the effect of its write operations.
The terminated state corresponds to the transaction leaving the system. All the information about the transaction is
removed from system tables.
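The state transitions described above can be written down as a small lookup table. The following Python sketch is an illustration only; the event names (end_transaction, commit, abort, terminate) and the run_transaction helper are assumptions chosen to match the description.

```python
# Allowed transitions: (current state, event) -> next state.
TRANSITIONS = {
    ("active", "read"): "active",
    ("active", "write"): "active",
    ("active", "end_transaction"): "partially committed",
    ("active", "abort"): "failed",
    ("partially committed", "commit"): "committed",
    ("partially committed", "abort"): "failed",
    ("committed", "terminate"): "terminated",
    ("failed", "terminate"): "terminated",
}


def run_transaction(events):
    """Follow the events from the active state and return the final state."""
    state = "active"
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not allowed in state {state!r}")
        state = TRANSITIONS[key]
    return state


print(run_transaction(["read", "write", "end_transaction", "commit"]))
print(run_transaction(["read", "write", "abort"]))
```

An event that is not listed for the current state, such as commit while the transaction is still active, is rejected with an error.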
Question:
Explain ACID properties/ desirable properties of Transaction
Desirable Properties of Transactions (ACID Properties)
Transactions should possess several properties, often called the ACID properties
A Atomicity:
a transaction is an atomic unit of processing and it is either performed entirely or not at all.
C Consistency Preservation:
a transaction should be consistency preserving that is it must take the database from one consistent
state to another.
I Isolation/Independence:
A transaction should appear as though it is being executed in isolation from other transactions, even
though many transactions are executed concurrently.
D Durability (or Permanency):
if a transaction changes the database and is committed, the changes must never be lost because of
any failure.
The atomicity property requires that we execute a transaction to completion. It is the responsibility of
the transaction recovery subsystem of a DBMS to ensure atomicity.
The preservation of consistency is generally considered to be the responsibility of the programmers
who write the database programs or of the DBMS module that enforces integrity constraints.
The isolation property is enforced by the concurrency control subsystem of the DBMS. If every
transaction does not make its updates (write operations) visible to other transactions until it is
committed, one form of isolation is enforced that solves the temporary update problem and eliminates
cascading rollbacks.
Durability is the responsibility of recovery subsystem.
Question:
Write a short note on System log
The System Log
The system log, which is generally written on stable storage, contains the
redundant data required to recover from volatile storage failures and also from
errors discovered by the transaction or the database system. The system log is also
known simply as the log and has sometimes been called the DBMS journal. It consists of the
following entries (also known as log records):
a. [start_transaction, T]: Points out that transaction T has started execution.
b. [write_item, T, X, old_value, new_value]: Points out that transaction T has changed the
value of database item X from old_value to new_value.
c. [read_item, T, X]: Points out that transaction T has read the value of database item X.
d. [commit, T]: Points out that transaction T has completed successfully, and affirms that
its effect can be committed (recorded permanently) to the database.
Consider the schedule Sa', which is the same as schedule Sa except that two
commit operations have been added to Sa.
Sa' is recoverable, even though it suffers from the lost update problem; this problem is
handled by serializability theory (see Section 21.5). However, consider the two (partial)
schedules Sc and Sd that follow:
Sc: r1(X); w1(X); r2(X); r1(Y); w2(X); c2; a1;
Sd: r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); c1; c2;
Se: r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); a1; a2;
Sc is not recoverable because T2 reads item X from T1, but T2 commits before T1
commits. The problem occurs if T1 aborts after the c2 operation in Sc, then the value of X
that T2 read is no longer valid and T2 must be aborted after it is committed, leading to a
schedule that is not recoverable. For the schedule to be recoverable, the c2 operation in
Sc must be postponed until after T1 commits, as shown in Sd.
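The recoverability condition can be checked mechanically. The following Python sketch is a simplified illustration: the function names are made up, the schedule string for Sc is reconstructed from the description above (T2 reads X from T1, commits, and then T1 aborts), and the check does not model the restoring of old values after an abort.

```python
import re


def parse(schedule):
    """Turn 'r1(X); w1(X); c1;' into [('r', 1, 'X'), ('w', 1, 'X'), ('c', 1, None)]."""
    ops = []
    for kind, txn, item in re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule):
        ops.append((kind, int(txn), item or None))
    return ops


def is_recoverable(schedule):
    """A schedule is recoverable if no transaction commits before every
    transaction it has read from has committed."""
    last_writer = {}   # item -> transaction that most recently wrote it
    reads_from = {}    # transaction -> set of transactions it has read from
    committed = set()
    for kind, txn, item in parse(schedule):
        if kind == "w":
            last_writer[item] = txn
        elif kind == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif kind == "c":
            if any(w not in committed for w in reads_from.get(txn, ())):
                return False
            committed.add(txn)
    return True


Sc = "r1(X); w1(X); r2(X); r1(Y); w2(X); c2; a1;"   # reconstructed, see lead-in
Sd = "r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); c1; c2;"
Se = "r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); a1; a2;"

for name, s in (("Sc", Sc), ("Sd", Sd), ("Se", Se)):
    print(name, is_recoverable(s))
```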
Recoverable Schedule
Example:
r2(X) r1(X) w1(X) w2(X) a1
Serializable schedule:
– A schedule S is serializable if it is equivalent to some serial schedule of the
same n transactions.
Result equivalent:
– Two schedules are called result equivalent if they produce the same final state
of the database.
Conflict equivalent:
– Two schedules are said to be conflict equivalent if the order of any two
conflicting operations is the same in both schedules.
Conflict serializable:
– A schedule S is said to be conflict serializable if it is conflict equivalent to some
serial schedule S’.
Question:
Problems related to Testing conflict serializability of a Schedule using
precedence graph
Testing conflict serializability of a Schedule S
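The usual test builds a precedence graph with one node per transaction and an edge Ti -> Tj whenever an operation of Ti conflicts with a later operation of Tj; the schedule is conflict serializable if and only if the graph has no cycle. The following Python sketch illustrates this under stated assumptions: the function names and the two sample schedules are made up for the illustration, and schedules are written in the r1(X); w1(X); notation used in these notes.

```python
import re


def parse(schedule):
    """Extract read/write operations, e.g. 'r1(X); w2(X);' -> [('r', 1, 'X'), ('w', 2, 'X')]."""
    return [(kind, int(txn), item)
            for kind, txn, item in re.findall(r"([rw])(\d+)\((\w+)\)", schedule)]


def precedence_edges(schedule):
    """Return the set of edges (Ti, Tj) of the precedence graph."""
    ops = parse(schedule)
    edges = set()
    for i, (kind_i, txn_i, item_i) in enumerate(ops):
        for kind_j, txn_j, item_j in ops[i + 1:]:
            # Two operations conflict if they belong to different transactions,
            # access the same item, and at least one of them is a write.
            if txn_i != txn_j and item_i == item_j and "w" in (kind_i, kind_j):
                edges.add((txn_i, txn_j))
    return edges


def is_conflict_serializable(schedule):
    """True if the precedence graph has no cycle."""
    edges = precedence_edges(schedule)
    nodes = {t for edge in edges for t in edge}
    # Repeatedly remove nodes with no incoming edge; a cycle leaves nodes behind.
    while nodes:
        free = {n for n in nodes if not any(dst == n for _, dst in edges)}
        if not free:
            return False
        nodes -= free
        edges = {(src, dst) for src, dst in edges if src not in free}
    return True


serial = "r1(X); w1(X); r1(Y); w1(Y); r2(X); w2(X);"
interleaved = "r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);"

print(precedence_edges(serial), is_conflict_serializable(serial))
print(precedence_edges(interleaved), is_conflict_serializable(interleaved))
```

When the graph is acyclic, the order in which nodes are removed gives an equivalent serial schedule.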
Fig: Constructing the precedence graphs for schedules A to D from fig 21.5 to test for
conflict serializability.
(a) Precedence graph for serial schedule A.
(b) Precedence graph for serial schedule B.
(c) Precedence graph for schedule C (not serializable).
(d) Precedence graph for schedule D (serializable, equivalent to schedule A).
Example of serializability testing. (a) The READ and WRITE operations
of three transactions T1, T2, and T3.
Draw Precedence graph for schedule E
Precedence graph for schedule E
Precedence graph for schedule F
Question:
Briefly Explain Transaction support in SQL
Transaction Support in SQL
An SQL transaction is a logical unit of work and is guaranteed to be atomic.
A single SQL statement is always considered to be atomic—either it completes
execution without an error or it fails and leaves the database unchanged.
Every transaction must have an explicit end statement, which is either a COMMIT or a
ROLLBACK.
• The isolation level
- specified using the statement ISOLATION LEVEL <isolation>, where the value
for <isolation> can be READ UNCOMMITTED, READ COMMITTED, REPEATABLE
READ, or SERIALIZABLE
- The default isolation level is SERIALIZABLE
A sample transaction consists of first inserting a new row in
the EMPLOYEE table and then updating the salary of all
employees who work in department 2.
If an error occurs on any of the SQL statements, the
entire transaction is rolled back.
This implies that any updated salary (by this
transaction) would be restored to its previous value and that
the newly inserted row would be removed.
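The commit-or-rollback behaviour of such a transaction can be tried out with Python's sqlite3 module. The sketch below is an illustration under stated assumptions: the rows, the inserted employee, the CHECK constraint used to provoke an error, and the hire_and_raise helper are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (
    Fname TEXT, Lname TEXT, Ssn TEXT PRIMARY KEY, Dno INTEGER,
    Salary REAL CHECK (Salary < 100000)   -- hypothetical rule, used to force an error
);
INSERT INTO EMPLOYEE VALUES ('John', 'Smith', '123456789', 2, 30000);
INSERT INTO EMPLOYEE VALUES ('Alicia', 'Zelaya', '999887777', 2, 25000);
""")


def hire_and_raise(conn, factor):
    """Insert one employee, then multiply department 2 salaries by factor, as one transaction."""
    try:
        conn.execute(
            "INSERT INTO EMPLOYEE VALUES ('Robert', 'Stone', '991004321', 2, 35000)"
        )
        conn.execute("UPDATE EMPLOYEE SET Salary = Salary * ? WHERE Dno = 2", (factor,))
        conn.commit()      # both statements become permanent together
        return True
    except sqlite3.Error:
        conn.rollback()    # undo everything done since the transaction began
        return False


def snapshot(conn):
    return conn.execute("SELECT Lname, Salary FROM EMPLOYEE ORDER BY Lname").fetchall()


failed = hire_and_raise(conn, 10)      # the UPDATE violates the CHECK constraint
after_failure = snapshot(conn)         # inserted row is gone, salaries unchanged

succeeded = hire_and_raise(conn, 1.1)  # both statements succeed and are committed
after_success = snapshot(conn)

print(failed, after_failure)
print(succeeded, after_success)
```

After the failed attempt, the table holds exactly the rows it had before the transaction started, which is what makes the second attempt with the same Ssn possible.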