Module 4 DBMS

Module 4 discusses advanced SQL queries, focusing on complex comparisons involving NULL values, nested queries, and correlated queries. It explains the use of SQL functions like EXISTS and UNIQUE, as well as the importance of join operations, including INNER and OUTER JOINs. The module also provides various query examples to illustrate these concepts in practice.

Module 4

Chapter 1
SQL: Advanced Queries
4.1 MORE COMPLEX SQL QUERIES
1. Comparisons Involving NULL and Three-Valued Logic:
• NULL is used to represent a missing value, which usually has one of three different
interpretations:
o value unknown (the value exists but is not known, or it is not known whether a value exists),
o value not available (the value exists but is purposely withheld), or
o attribute not applicable (the attribute is undefined for this tuple).
Examples:
1. Unknown value: A particular person has a date of birth but it is not known, so it is represented
by NULL in the database.
2. Unavailable or withheld value: A person has a home phone but does not want it to be listed,
so it is withheld and represented as NULL in the database.
3. Not applicable attribute: An attribute LastCollegeDegree would be NULL for a person who
has no college degrees, because it does not apply to that person.
• SQL does not distinguish between the different meanings of NULL.
• In general, each NULL value is considered to be different from every other NULL value in
the database records.
• When a NULL value is involved in a comparison operation, the result is considered to
be UNKNOWN (it may be TRUE or it may be FALSE). Hence, SQL uses a three-valued logic
with values TRUE, FALSE and UNKNOWN.
• The truth values of three-valued logical expressions that use the logical connectives
AND, OR and NOT are shown in the tables below.
(a) AND     | TRUE     FALSE    UNKNOWN
    --------+--------------------------
    TRUE    | TRUE     FALSE    UNKNOWN
    FALSE   | FALSE    FALSE    FALSE
    UNKNOWN | UNKNOWN  FALSE    UNKNOWN

(b) OR      | TRUE     FALSE    UNKNOWN
    --------+--------------------------
    TRUE    | TRUE     TRUE     TRUE
    FALSE   | TRUE     FALSE    UNKNOWN
    UNKNOWN | TRUE     UNKNOWN  UNKNOWN

Navyashree KS , CSE(DS), Asst Prof RNSIT, Bangalore



(c) NOT     |
    --------+--------
    TRUE    | FALSE
    FALSE   | TRUE
    UNKNOWN | UNKNOWN

In (a) and (b), the rows and columns represent the values of the results of two three-valued
Boolean expressions which would appear in the WHERE clause of an SQL query. Each
expression result would have a value of TRUE, FALSE, or UNKNOWN.
In select-project-join queries, the general rule is that only those combinations of tuples that
evaluate the logical expression of the query to TRUE are selected. Tuple combinations that
evaluate to FALSE or UNKNOWN are not selected.
There are exceptions to that rule for certain operations such as outer joins.
• SQL allows queries that check whether an attribute value is NULL.
• SQL uses the comparison operators IS and IS NOT (rather than = and <>) to compare an
attribute value to NULL. SQL considers each NULL value as being distinct from every other
NULL value, so equality comparison is not appropriate. It follows that when a join condition
is specified, tuples with NULL values for the join attributes are not included in the result
(unless it is an outer join).
Query 1: Retrieve the names of all employees who do not have supervisors.
SELECT Fname, Lname
FROM EMPLOYEE
WHERE Super_ssn IS NULL;
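The effect of three-valued logic on Query 1 can be checked with a small sketch. It assumes Python's built-in sqlite3 module and a two-row EMPLOYEE table invented purely for illustration; SQLite reports UNKNOWN as NULL, TRUE as 1 and FALSE as 0.

```python
import sqlite3

# Hypothetical two-row EMPLOYEE table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT, Super_ssn TEXT);
INSERT INTO EMPLOYEE VALUES ('James', 'Borg', '888665555', NULL);
INSERT INTO EMPLOYEE VALUES ('Franklin', 'Wong', '333445555', '888665555');
""")

# '= NULL' evaluates to UNKNOWN for every row, so no tuple is selected.
eq_null = conn.execute(
    "SELECT Fname, Lname FROM EMPLOYEE WHERE Super_ssn = NULL").fetchall()

# IS NULL is the proper test and finds the employee without a supervisor.
is_null = conn.execute(
    "SELECT Fname, Lname FROM EMPLOYEE WHERE Super_ssn IS NULL").fetchall()

# A few entries of the truth tables: UNKNOWN shows up as None in Python.
logic = conn.execute(
    "SELECT NULL = NULL, NULL AND 0, NULL OR 1, NOT (NULL = NULL)").fetchone()

print(eq_null, is_null, logic)
```

The third query mirrors the tables above: UNKNOWN AND FALSE is FALSE, UNKNOWN OR TRUE is TRUE, and NOT UNKNOWN stays UNKNOWN.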
2. Nested Queries, Tuples, and Set / Multiset Comparisons:
• Some queries require that existing values in the database be fetched and then used in a
comparison condition.
• Nested queries are complete select-from-where blocks within another SQL query. That other
query is called the outer query. Nested queries can appear in the WHERE clause, the
FROM clause, or other SQL clauses as needed.
• The comparison operator IN compares a value v with a set (or multiset) of values V and
evaluates to TRUE if v is one of the elements in V.
Query 2: Retrieve the project numbers of all projects that either
belong to a department managed by an employee whose last name is 'Smith',
OR
have at least one employee with the last name 'Smith' working on them. Display only
distinct project numbers.


SELECT DISTINCT Pnumber
FROM PROJECT
WHERE Pnumber IN
      (SELECT Pnumber
       FROM PROJECT, DEPARTMENT, EMPLOYEE
       WHERE Dnum = Dnumber AND Mgrssn = Ssn AND Lname = 'Smith')
   OR
      Pnumber IN
      (SELECT Pno
       FROM WORKS_ON, EMPLOYEE
       WHERE Essn = Ssn AND Lname = 'Smith');

If a nested query returns a single attribute and a single tuple, the query result will be a single
value. In such cases, = can be used instead of IN for the comparison operator. In general, a
nested query will return a table (relation), which is a set or multiset of tuples.
• SQL allows the use of tuples of values in comparisons by placing them in parentheses. This
is illustrated in the query below.
Query 3: Retrieve the Ssns of all employees who work on the same (project, hours)
combination on some project that employee 'John Smith' (whose Ssn is 123456789) works
on.
SELECT DISTINCT Essn
FROM WORKS_ON
WHERE (Pno, Hours) IN (SELECT Pno, Hours
                       FROM WORKS_ON
                       WHERE Essn = '123456789');
In this example, the IN operator compares the sub-tuple of values in parentheses (Pno, Hours)
for each tuple in WORKS_ON with the set of union-compatible tuples produced by the nested
query.
A number of comparison operators can be used to compare a single value v (typically an
attribute name) to a set or multiset V (typically a nested query). The = ANY (or = SOME)
operator returns TRUE if the value v is equal to some value in the set V and is hence equivalent
to IN. The keywords ANY and SOME have the same meaning. Operators that can be combined
with ANY include >, >=, <, <= and <>. The keyword ALL can also be combined with each of
these operators. The comparison condition (v > ALL V) returns TRUE if the value v is greater
than all the values in the set (or multiset) V.
Query 4: Retrieve the names of employees whose salary is greater than the salary of all
the employees in department 5.
SELECT Lname, Fname
FROM EMPLOYEE
WHERE Salary > ALL (SELECT Salary
                    FROM EMPLOYEE
                    WHERE Dno = 5);


If attributes of the same name exist, one in the FROM clause of the nested query and one in the
FROM clause of the outer query, the attribute names become ambiguous. The rule is that
a reference to an unqualified attribute refers to the relation declared in the innermost nested
query. It is generally advisable to create tuple variables (aliases) for all the tables referenced in
an SQL query to avoid errors and ambiguities.
Query 5: Retrieve the name of each employee who has a dependent with the same first
name and sex as the employee.
SELECT E.Fname, E.Lname
FROM EMPLOYEE AS E
WHERE E.Ssn IN (SELECT Essn
                FROM DEPENDENT
                WHERE E.Fname = Dependent_name AND E.Sex = Sex);
3. Correlated Nested Queries:
• Whenever a condition in the WHERE clause of a nested query references some attribute of a
relation declared in the outer query, the two queries are said to be correlated.
• For a correlated nested query, the nested query is evaluated once for each tuple (or
combination of tuples) in the outer query.
Example: In Query 5, for each EMPLOYEE tuple, evaluate the nested query, which
retrieves the Essn values for all DEPENDENT tuples with the same sex and name as that
EMPLOYEE tuple; if the Ssn value of the EMPLOYEE tuple is in the result of the nested
query, then select that EMPLOYEE tuple.
In general, a query written with nested select-from-where blocks and using the = or IN
comparison operators can always be expressed as a single block query. The query below is
another way of solving Query 5.
SELECT E.Fname, E.Lname
FROM EMPLOYEE AS E, DEPENDENT AS D
WHERE E.Ssn = D.Essn AND E.Sex = D.Sex
      AND E.Fname = D.Dependent_name;
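The equivalence of the two forms of Query 5 can be tried out with the sketch below. It assumes Python's sqlite3 module and a handful of invented EMPLOYEE and DEPENDENT rows; only the two SELECT statements come from the text above.

```python
import sqlite3

# Hypothetical sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT, Sex TEXT);
CREATE TABLE DEPENDENT (Essn TEXT, Dependent_name TEXT, Sex TEXT);
INSERT INTO EMPLOYEE VALUES ('John', 'Smith', '123456789', 'M');
INSERT INTO EMPLOYEE VALUES ('Alicia', 'Zelaya', '999887777', 'F');
INSERT INTO DEPENDENT VALUES ('123456789', 'John', 'M');
INSERT INTO DEPENDENT VALUES ('123456789', 'Alice', 'F');
INSERT INTO DEPENDENT VALUES ('999887777', 'Alicia', 'M');
""")

# Correlated form: the inner block refers to E, declared in the outer block.
nested = conn.execute("""
    SELECT E.Fname, E.Lname
    FROM EMPLOYEE AS E
    WHERE E.Ssn IN (SELECT D.Essn
                    FROM DEPENDENT AS D
                    WHERE E.Fname = D.Dependent_name AND E.Sex = D.Sex)
""").fetchall()

# Single-block form: the same conditions written as one join.
joined = conn.execute("""
    SELECT E.Fname, E.Lname
    FROM EMPLOYEE AS E, DEPENDENT AS D
    WHERE E.Ssn = D.Essn AND E.Sex = D.Sex
          AND E.Fname = D.Dependent_name
""").fetchall()

print(nested, joined)
```

With this data only John Smith qualifies: Alicia Zelaya has a dependent with her first name, but the sex differs.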
4. The EXISTS and UNIQUE Functions in SQL:
• EXISTS and UNIQUE are Boolean functions that return TRUE or FALSE.
• They can be used in a WHERE clause condition.
• The EXISTS function in SQL is used to check whether the result of a correlated nested query
is empty (contains no tuples) or not.
• The result of EXISTS is the Boolean value TRUE if the nested query result contains at least
one tuple, or FALSE if the nested query result contains no tuples.


EXISTS and NOT EXISTS are typically used in conjunction with a correlated nested query.
• EXISTS (Q) returns TRUE if there is at least one tuple in the result of the nested query Q
and returns FALSE otherwise.
• NOT EXISTS (Q) returns TRUE if there are no tuples in the result of the nested query Q
and returns FALSE otherwise.
Query 5 can be written in an alternative form that uses EXISTS. The nested query
references the Ssn, Fname, and Sex attributes of the EMPLOYEE relation from the outer
query. For each EMPLOYEE tuple, evaluate the nested query, which retrieves all
DEPENDENT tuples with the same Essn, Sex and Dependent_name as the EMPLOYEE
tuple; if at least one tuple exists in the result of the nested query, then select that EMPLOYEE
tuple.
SELECT E.Fname, E.Lname
FROM EMPLOYEE AS E
WHERE EXISTS (SELECT *
              FROM DEPENDENT AS D
              WHERE E.Ssn = D.Essn AND E.Sex = D.Sex
                    AND E.Fname = D.Dependent_name);
Query 6: Retrieve the names of employees who have no dependents.
SELECT Fname, Lname
FROM EMPLOYEE
WHERE NOT EXISTS (SELECT *
                  FROM DEPENDENT
                  WHERE Ssn = Essn);
In this query, for each EMPLOYEE tuple the correlated nested query selects all DEPENDENT
tuples whose Essn value matches the EMPLOYEE Ssn. If the result is empty, no dependents are
related to the employee, the WHERE clause condition evaluates to TRUE, and that EMPLOYEE
tuple is selected.
Query 7: List the names of managers who have at least one dependent.
SELECT Fname, Lname
FROM EMPLOYEE
WHERE EXISTS (SELECT * FROM DEPENDENT WHERE Ssn = Essn)
  AND EXISTS (SELECT * FROM DEPARTMENT WHERE Ssn = Mgrssn);
Query 8: Retrieve the name of each employee who works on all the projects controlled by
department number 5.
SELECT Fname, Lname
FROM EMPLOYEE
WHERE NOT EXISTS ((SELECT Pnumber
                   FROM PROJECT
                   WHERE Dnum = 5)
                  EXCEPT
                  (SELECT Pno
                   FROM WORKS_ON
                   WHERE Ssn = Essn));
The first subquery (which is not correlated with the outer query) selects all projects controlled
by department 5, and the second subquery (which is correlated with the outer query) selects all
projects that the particular employee being considered works on. If the set difference of the
first subquery result MINUS (EXCEPT) the second subquery result is empty, it means that the
employee works on all the projects and is therefore selected.
• The function UNIQUE (Q) returns TRUE if there are no duplicate tuples in the result of query
Q; otherwise it returns FALSE.
• The UNIQUE function can be used to test whether the result of a nested query is a set (no
duplicates) or a multiset (duplicates exist).
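Query 6 (NOT EXISTS) and Query 8 (NOT EXISTS over a set difference) can be exercised with the sketch below. It assumes Python's sqlite3 module and invented sample rows. SQLite does not accept parentheses around the operands of EXCEPT, so they are dropped here; an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Hypothetical sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT);
CREATE TABLE DEPENDENT (Essn TEXT, Dependent_name TEXT);
CREATE TABLE PROJECT (Pnumber INTEGER, Dnum INTEGER);
CREATE TABLE WORKS_ON (Essn TEXT, Pno INTEGER);
INSERT INTO EMPLOYEE VALUES ('John', 'Smith', '123456789'),
                            ('Franklin', 'Wong', '333445555'),
                            ('Joyce', 'English', '453453453');
INSERT INTO DEPENDENT VALUES ('123456789', 'Alice');
INSERT INTO PROJECT VALUES (1, 5), (2, 5), (10, 4);
INSERT INTO WORKS_ON VALUES ('123456789', 1), ('123456789', 2),
                            ('333445555', 1), ('453453453', 10);
""")

# Query 6: employees for whom the correlated subquery returns no tuple.
no_dependents = conn.execute("""
    SELECT Fname, Lname
    FROM EMPLOYEE
    WHERE NOT EXISTS (SELECT * FROM DEPENDENT WHERE Ssn = Essn)
    ORDER BY Lname
""").fetchall()

# Query 8: (projects of department 5) EXCEPT (projects the employee works on)
# must be empty for the employee to be selected.
works_on_all = conn.execute("""
    SELECT Fname, Lname
    FROM EMPLOYEE
    WHERE NOT EXISTS (SELECT Pnumber FROM PROJECT WHERE Dnum = 5
                      EXCEPT
                      SELECT Pno FROM WORKS_ON WHERE Ssn = Essn)
""").fetchall()

print(no_dependents, works_on_all)
```

Only John Smith works on both department 5 projects (1 and 2); Franklin Wong misses project 2, so the set difference for him is not empty.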
5. Explicit Sets and Renaming of Attributes in SQL:
• An explicit set of values can be used in the WHERE clause rather than a nested query. Such a
set is enclosed in parentheses in SQL.
Query 8: Retrieve the Essn of all employees who work on project numbers 1, 2 or 3.
SELECT DISTINCT Essn
FROM WORKS_ON
WHERE Pno IN (1, 2, 3);
• In SQL, it is possible to rename any attribute that appears in the result of a query by adding
the qualifier AS followed by the desired new name.
• The AS construct can be used to alias both attribute and relation names, and it can be
used in the appropriate parts of a query.
SELECT E.Lname AS Employee_name, S.Lname AS Supervisor_name
FROM EMPLOYEE AS E, EMPLOYEE AS S
WHERE E.Superssn = S.Ssn;
6. Joined Tables in SQL:
• The joined table (or joined relation) permits users to specify a table resulting from a join
operation in the FROM clause of a query.
• This construct avoids mixing together all the select and join conditions in the WHERE clause.
Example: Consider the query which retrieves the name and address of every employee who works
for the 'Research' department. First specify the join of the EMPLOYEE and DEPARTMENT
relations and then select the desired tuples and attributes. The FROM clause contains a single
joined table.
SELECT Fname, Lname, Address
FROM (EMPLOYEE JOIN DEPARTMENT ON Dno = Dnumber)
WHERE Dname = 'Research';
The attributes of such a table are all the attributes of the first table EMPLOYEE followed by
all attributes of the second table DEPARTMENT.
• In a NATURAL JOIN on two relations R and S, no join condition is specified; an implicit
EQUIJOIN condition for each pair of attributes with the same name from R and S is created.
Each such pair of attributes is included only once in the resulting relation.
• If the names of the join attributes are not the same in the base relations, rename the attributes
so that they match and then apply the NATURAL JOIN. The AS construct can be used to rename a
relation and all its attributes in the FROM clause.
Example: Here the DEPARTMENT relation is renamed as DEPT and its attributes are renamed
as Dname, Dno, Mssn and Msdate. The implied join condition for this natural join is
EMPLOYEE.Dno = DEPT.Dno because it is the only pair of attributes with the same name.


SELECT Fname, Lname, Address
FROM (EMPLOYEE NATURAL JOIN
      (DEPARTMENT AS DEPT (Dname, Dno, Mssn, Msdate)))
WHERE Dname = 'Research';
• The default type of join in a joined table is an INNER JOIN, where a tuple is included in the
result only if a matching tuple exists in the other relation. If every tuple should be included in
the result, OUTER JOIN must be explicitly specified.
Query 9: Retrieve the names of employees along with their supervisor's name; if an
employee has no supervisor, include his/her name too.
SELECT E.Lname AS Employee_name, S.Lname AS Supervisor_name
FROM (EMPLOYEE AS E LEFT OUTER JOIN EMPLOYEE AS S ON E.Superssn = S.Ssn);
With an inner join, employees whose Superssn is NULL would be dropped, which is why the
OUTER JOIN is requested explicitly here.
• LEFT OUTER JOIN (every tuple in the left table must appear in the result; if it does
not have a matching tuple, it is padded with NULL values for the attributes of the right
table).
• RIGHT OUTER JOIN (every tuple in the right table must appear in the result; if it
does not have a matching tuple, it is padded with NULL values for the attributes of the
left table).
Left Outer JOIN Syntax:
SELECT column_name(s)
FROM table1
LEFT JOIN table2
ON table1.column_name = table2.column_name;
Example 1:
SELECT Customers.CustomerName, Orders.OrderID
FROM Customers
LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID
ORDER BY Customers.CustomerName;
Example 2:
SELECT E.Lname AS Employee_name, S.Lname AS Supervisor_name
FROM (EMPLOYEE AS E LEFT OUTER JOIN EMPLOYEE AS S ON E.Superssn = S.Ssn);
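The difference between the default inner join and the LEFT OUTER JOIN of Query 9 can be seen in the sketch below. It assumes Python's sqlite3 module and three invented EMPLOYEE rows; the ORDER BY clause is added only so that the output order is predictable.

```python
import sqlite3

# Hypothetical three-level supervision chain, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Lname TEXT, Ssn TEXT, Superssn TEXT);
INSERT INTO EMPLOYEE VALUES ('Borg', '888665555', NULL),
                            ('Wong', '333445555', '888665555'),
                            ('Smith', '123456789', '333445555');
""")

# Inner join: Borg has a NULL Superssn, matches nothing, and is dropped.
inner = conn.execute("""
    SELECT E.Lname AS Employee_name, S.Lname AS Supervisor_name
    FROM EMPLOYEE AS E JOIN EMPLOYEE AS S ON E.Superssn = S.Ssn
    ORDER BY E.Lname
""").fetchall()

# Left outer join: every row of E is kept; missing supervisors become NULL.
left_outer = conn.execute("""
    SELECT E.Lname AS Employee_name, S.Lname AS Supervisor_name
    FROM EMPLOYEE AS E LEFT OUTER JOIN EMPLOYEE AS S ON E.Superssn = S.Ssn
    ORDER BY E.Lname
""").fetchall()

print(inner)
print(left_outer)
```

The outer join result has one more row than the inner join: Borg, padded with a NULL (None in Python) supervisor name.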
Right Outer JOIN Syntax:
SELECT column_name(s)
FROM table1
RIGHT JOIN table2
ON table1.column_name = table2.column_name;


Example 1:
SELECT Orders.OrderID, Employees.LastName, Employees.FirstName
FROM Orders
RIGHT JOIN Employees ON Orders.EmployeeID = Employees.EmployeeID
ORDER BY Orders.OrderID;
• Join specifications can be nested, where one of the tables in a join may itself be a joined table.
This allows the specification of the join of three or more tables as a single joined table, which
is called a multiway join.
Example:
SELECT Pnumber, Dnum, Lname, Bdate, Address
FROM ((PROJECT JOIN DEPARTMENT ON Dnum = Dnumber)
      JOIN EMPLOYEE ON Mgrssn = Ssn)
WHERE Plocation = 'Stafford';
• Some SQL implementations have a different syntax to specify outer joins by using the
comparison operators += for left outer join, =+ for right outer join and +=+ for full outer join
when specifying the join condition. (Oracle's legacy notation follows the same idea but is
written by appending (+) to the column of the table whose rows may be missing, for example
E.Superssn = S.Ssn (+).)
Example:
SELECT E.Lname, S.Lname
FROM EMPLOYEE E, EMPLOYEE S
WHERE E.Superssn += S.Ssn;
7. Aggregate Functions in SQL:
• Aggregate functions are used to summarize information from multiple tuples into a
single-tuple summary.
• Grouping is used to create subgroups of tuples before summarization.
• SQL has the built-in aggregate functions COUNT, SUM, MAX, MIN and AVG.
• The COUNT function returns the number of tuples or values as specified in a query.
• The functions SUM, MAX, MIN and AVG are applied to a set or multiset of numeric values
and return the sum, the maximum value, the minimum value and the average of those values
respectively.
• These functions can be used in the SELECT clause or in a HAVING clause.
• The functions MAX and MIN can also be used with attributes that have nonnumeric domains
if the domain values have a total ordering among one another.
• NULL values are discarded when aggregate functions are applied to a particular column
(attribute). COUNT (*) counts tuples, not values, hence NULL values do not affect it.
• When an aggregate function is applied to a collection of values, NULLs are removed from
the collection before the calculation. If the collection becomes empty because all values are
NULL, the aggregate function will return NULL, except COUNT, which returns 0 for an empty
collection of values.
• Aggregate functions can also be used in selection conditions involving nested queries. A
correlated nested query with an aggregate function can be specified and then used in the
WHERE clause of an outer query.
• SQL also has aggregate functions SOME and ALL that can be applied to a collection of
Boolean values. SOME returns TRUE if at least one element in the collection is TRUE, whereas
ALL returns TRUE if all elements in the collection are TRUE.
Query 10: Find the sum of salaries, maximum salary, the minimum salary, and the
average salary among all employees.
SELECT SUM(Salary), MAX(Salary), MIN(Salary), AVG(Salary)
FROM EMPLOYEE;
• This query returns a single-row summary of all the rows in the EMPLOYEE table. We can
use the keyword AS to rename the column names in the resulting single-row table.
SELECT SUM(Salary) AS Total_salary, MAX(Salary) AS Highest_salary,
       MIN(Salary) AS Lowest_salary, AVG(Salary) AS Average_salary
FROM EMPLOYEE;
Query 11: Find the sum of the salaries, maximum salary, the minimum salary, and the
average salary among employees who work for the 'Research' department.
SELECT SUM(Salary), MAX(Salary), MIN(Salary), AVG(Salary)
FROM EMPLOYEE, DEPARTMENT
WHERE Dno = Dnumber AND Dname = 'Research';
Query 12: Retrieve the total number of employees in the company.
SELECT COUNT(*)
FROM EMPLOYEE;
Query 13: Retrieve the total number of employees in the 'Research' department.
SELECT COUNT(*)
FROM EMPLOYEE, DEPARTMENT
WHERE Dno = Dnumber AND Dname = 'Research';
Here the asterisk (*) refers to the rows (tuples), so COUNT (*) returns the number of rows in
the result of the query. The COUNT function can also be used to count values in a column
rather than tuples.
Query 14: Count the number of distinct salary values in the database.
SELECT COUNT(DISTINCT Salary)
FROM EMPLOYEE;
Writing COUNT (Salary) instead of COUNT (DISTINCT Salary) would not eliminate duplicate
values of Salary. In either form, tuples with NULL for Salary are not counted.
Query 15: Retrieve the names of all employees who have two or more dependents.
SELECT Lname, Fname
FROM EMPLOYEE
WHERE (SELECT COUNT(*)
       FROM DEPENDENT
       WHERE Ssn = Essn) >= 2;
The correlated nested query counts the number of dependents that each employee has. If the
count is greater than or equal to two, the employee tuple is selected.
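How NULLs interact with the aggregate functions can be checked with the sketch below. It assumes Python's sqlite3 module and five invented EMPLOYEE rows, one of which has a NULL Salary.

```python
import sqlite3

# Hypothetical rows, for illustration only; one Salary is NULL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Ssn TEXT, Salary INTEGER);
INSERT INTO EMPLOYEE VALUES ('1', 30000), ('2', 30000), ('3', 40000),
                            ('4', 60000), ('5', NULL);
""")

# COUNT(*) counts tuples; the other aggregates discard the NULL first.
summary = conn.execute("""
    SELECT COUNT(*), COUNT(Salary), COUNT(DISTINCT Salary),
           SUM(Salary), AVG(Salary)
    FROM EMPLOYEE
""").fetchone()

# Collection that is empty once NULLs are removed:
# SUM returns NULL, COUNT(Salary) returns 0, COUNT(*) still sees one tuple.
all_null = conn.execute("""
    SELECT SUM(Salary), COUNT(Salary), COUNT(*)
    FROM EMPLOYEE
    WHERE Salary IS NULL
""").fetchone()

print(summary, all_null)
```

The average is 160000 / 4, not 160000 / 5, because the NULL salary is removed before the calculation.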


8. Grouping: The GROUP BY and HAVING Clauses:
• The aggregate functions can be applied to subgroups of tuples in a relation, where the
subgroups are based on some attribute values. For example, to find the average salary of
employees in each department, we need to partition the relation into non-overlapping subsets
(or groups) of tuples.
• Each group (or partition) will consist of tuples that have the same value of some attribute(s)
called the grouping attribute(s). The function is then applied to each subgroup independently to
produce summary information about each group.
• SQL has a GROUP BY clause for specifying the grouping attributes. These attributes must
also appear in the SELECT clause so that the value resulting from applying each aggregate
function to a group of tuples appears along with the value of the grouping attribute(s).
• If NULLs exist in the grouping attribute, then a separate group is created for all tuples with
a NULL value in the grouping attribute. For example, if some EMPLOYEE tuples had NULL for
the grouping attribute Dno, there would be a separate group for those tuples in the result of
Query 16.
• A join condition can be used in conjunction with grouping.
• To retrieve the values of aggregate functions for only those groups that satisfy certain
conditions, SQL provides a HAVING clause, which can appear in conjunction with a GROUP
BY clause. The HAVING clause specifies a selection condition on groups (rather than on
individual tuples): it provides a condition on the summary information regarding the group of
tuples associated with each value of the grouping attributes. Only the groups that satisfy the
condition are retrieved in the result of the query.
• The selection conditions in the WHERE clause limit the tuples to which functions are applied,
but the HAVING clause serves to choose whole groups.
Query 16: For each department, retrieve the department number, the number of
employees in the department, and their average salary.
SELECT Dno, COUNT(*), AVG(Salary)
FROM EMPLOYEE
GROUP BY Dno;
The EMPLOYEE tuples are divided into groups, each group having the same value for the
grouping attribute Dno. The COUNT and AVG functions are applied to each such group of
tuples separately. The SELECT clause includes only the grouping attribute and the functions
to be applied on each group of tuples.
Query 17: For each project, retrieve the project number, project name, and the number
of employees who work on that project.
SELECT Pnumber, Pname, COUNT(*)
FROM PROJECT, WORKS_ON
WHERE Pnumber = Pno
GROUP BY Pnumber, Pname;
In this case, the grouping and functions are applied after the joining of the two relations.


Query 18: For each project on which more than two employees work, retrieve the project
number, project name, and the number of employees who work on that project.
SELECT Pnumber, Pname, COUNT(*)
FROM PROJECT, WORKS_ON
WHERE Pnumber = Pno
GROUP BY Pnumber, Pname
HAVING COUNT(*) > 2;
Query 19: For each project, retrieve the project number, the project name and the
number of employees from department 5 who work on the project.
SELECT Pnumber, Pname, COUNT(*)
FROM PROJECT, WORKS_ON, EMPLOYEE
WHERE Pnumber = Pno AND Ssn = Essn AND Dno = 5
GROUP BY Pnumber, Pname;
Query 20: For each department that has more than 5 employees, retrieve the department
number and the number of employees who are making a salary more than $40,000.
SELECT Dno, COUNT(*)
FROM EMPLOYEE
WHERE Salary > 40000 AND Dno IN (SELECT Dno
                                 FROM EMPLOYEE
                                 GROUP BY Dno
                                 HAVING COUNT(*) > 5)
GROUP BY Dno;
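The reason Query 20 uses a nested query, instead of simply adding HAVING COUNT(*) > 5 to the outer block, is that WHERE filters tuples before the groups are formed. The sketch below illustrates this with Python's sqlite3 module. The six EMPLOYEE rows are invented, and the threshold is lowered from 5 to 2 employees so that the small table is enough; both are assumptions made for illustration.

```python
import sqlite3

# Hypothetical rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Ssn TEXT, Salary INTEGER, Dno INTEGER);
INSERT INTO EMPLOYEE VALUES ('1', 30000, 5), ('2', 45000, 5), ('3', 51000, 5),
                            ('4', 43000, 4), ('5', 25000, 4), ('6', 55000, 1);
""")

# Query 16: one summary row per department.
per_dept = conn.execute("""
    SELECT Dno, COUNT(*), AVG(Salary)
    FROM EMPLOYEE
    GROUP BY Dno
    ORDER BY Dno
""").fetchall()

# Query 20 pattern (threshold lowered to 2): department size is tested on
# ALL employees in the nested query, the salary test only affects the count.
correct = conn.execute("""
    SELECT Dno, COUNT(*)
    FROM EMPLOYEE
    WHERE Salary > 40000
      AND Dno IN (SELECT Dno FROM EMPLOYEE GROUP BY Dno HAVING COUNT(*) > 2)
    GROUP BY Dno
""").fetchall()

# Tempting but different: HAVING now counts only the employees that
# survived the WHERE filter, so department 5 (two high earners) is lost.
naive = conn.execute("""
    SELECT Dno, COUNT(*)
    FROM EMPLOYEE
    WHERE Salary > 40000
    GROUP BY Dno
    HAVING COUNT(*) > 2
""").fetchall()

print(per_dept, correct, naive)
```

Department 5 has three employees, two of whom earn more than 40000, so the nested form reports (5, 2) while the naive form returns nothing.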
9. SQL Constructs: WITH and CASE
• The WITH clause allows a user to define a table that will only be used in a particular query.
This table will be dropped after its use in that query.
• Queries using WITH can generally be written using other SQL constructs.
Example:
WITH LARGE_DEPTS (Dno) AS
     (SELECT Dno
      FROM EMPLOYEE
      GROUP BY Dno
      HAVING COUNT(*) > 5)
SELECT Dno, COUNT(*)
FROM EMPLOYEE
WHERE Salary > 40000 AND Dno IN (SELECT Dno FROM LARGE_DEPTS)
GROUP BY Dno;
Here a temporary table LARGE_DEPTS is defined using the WITH clause; its result holds
the Dnos of departments with more than 5 employees. This table is then used in the subsequent
query. Once this query is executed, the temporary table LARGE_DEPTS is discarded.
• The SQL CASE construct can be used when a value can be different based on certain
conditions.
• It can be used in any part of an SQL query where a value is expected, including when
querying, inserting or updating tuples.
Example: Suppose we want to give employees different raise amounts depending on which
department they work for. Employees in department 5 get a $2000 raise, those in department 4
get $1500 and those in department 1 get $3000. We can write the update operation as:
UPDATE EMPLOYEE
SET Salary = CASE WHEN Dno = 5 THEN Salary + 2000
                  WHEN Dno = 4 THEN Salary + 1500
                  WHEN Dno = 1 THEN Salary + 3000
                  ELSE Salary + 0
             END;
Here the salary raise value is determined through the CASE construct based on the department
number for which each employee works.
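Both constructs can be tried out with the sketch below, which assumes Python's sqlite3 module and five invented EMPLOYEE rows. Because the table is tiny, the LARGE_DEPTS threshold is lowered from 5 to 1 employee; that change is an assumption made only for this illustration.

```python
import sqlite3

# Hypothetical rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Ssn TEXT, Salary INTEGER, Dno INTEGER);
INSERT INTO EMPLOYEE VALUES ('1', 30000, 5), ('2', 45000, 5), ('3', 25000, 4),
                            ('4', 55000, 1), ('5', 40000, 3);
""")

# WITH: LARGE_DEPTS exists only for the duration of this one query.
large_dept_counts = conn.execute("""
    WITH LARGE_DEPTS (Dno) AS
         (SELECT Dno FROM EMPLOYEE GROUP BY Dno HAVING COUNT(*) > 1)
    SELECT Dno, COUNT(*)
    FROM EMPLOYEE
    WHERE Salary > 40000 AND Dno IN (SELECT Dno FROM LARGE_DEPTS)
    GROUP BY Dno
""").fetchall()

# CASE: the raise depends on the department; other departments are unchanged.
conn.execute("""
    UPDATE EMPLOYEE
    SET Salary = CASE WHEN Dno = 5 THEN Salary + 2000
                      WHEN Dno = 4 THEN Salary + 1500
                      WHEN Dno = 1 THEN Salary + 3000
                      ELSE Salary + 0
                 END
""")
salaries = conn.execute(
    "SELECT Ssn, Salary FROM EMPLOYEE ORDER BY Ssn").fetchall()

print(large_dept_counts, salaries)
```

The employee in department 3 falls through to the ELSE branch and keeps a salary of 40000.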

• The CASE construct can also be used when inserting tuples that can have different attributes
being NULL depending on the type of record being inserted into a table, as when a
specialization is mapped into a single table or when a union type is mapped into a relation.
10. Recursive Queries in SQL
• A recursive query can be written in SQL using the WITH RECURSIVE construct. It allows
users to specify a recursive query in a declarative manner.
• An example of a recursive relationship between tuples of the same type is the relationship
between an employee and a supervisor. This relationship is described by the foreign key
Super_ssn of the EMPLOYEE relation.
An example of a recursive operation is to retrieve all supervisees of a supervisor employee
e at all levels: all employees e' directly supervised by e, all employees e'' directly
supervised by each employee e', all employees e''' directly supervised by each employee
e'', and so on.
WITH RECURSIVE SUP_EMP (SupSsn, EmpSsn) AS
     (SELECT Super_ssn, Ssn
      FROM EMPLOYEE
      UNION
      SELECT S.SupSsn, E.Ssn
      FROM EMPLOYEE AS E, SUP_EMP AS S
      WHERE E.Super_ssn = S.EmpSsn)
SELECT *
FROM SUP_EMP;
Here the view SUP_EMP will hold the result of the recursive query. The view is initially empty.
It is first loaded with the first-level (supervisor, supervisee) Ssn combinations through the first
part (SELECT Super_ssn, Ssn FROM EMPLOYEE), which is called the base query. This will
be combined via UNION with each successive level of supervisees through the second part,
where the view contents are joined again with the base values to get the second-level
combinations, which are UNIONed with the first level. This is repeated with successive levels
until a fixed point is reached, where no more tuples are added to the view. At this point the
result of the recursive query is in the view SUP_EMP.
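The fixed-point behaviour can be observed with the sketch below, which assumes Python's sqlite3 module and a three-level supervision chain invented for illustration. The base query here additionally filters out the NULL Super_ssn of the top-level employee, and the final ORDER BY is added for a predictable output order; both are assumptions beyond the text.

```python
import sqlite3

# Hypothetical chain: Borg supervises Wong, Wong supervises Smith.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Lname TEXT, Ssn TEXT, Super_ssn TEXT);
INSERT INTO EMPLOYEE VALUES ('Borg', '888665555', NULL),
                            ('Wong', '333445555', '888665555'),
                            ('Smith', '123456789', '333445555');
""")

sup_emp = conn.execute("""
    WITH RECURSIVE SUP_EMP (SupSsn, EmpSsn) AS
         (-- base query: direct (supervisor, supervisee) pairs
          SELECT Super_ssn, Ssn
          FROM EMPLOYEE
          WHERE Super_ssn IS NOT NULL
          UNION
          -- recursive part: extend each known pair by one more level
          SELECT S.SupSsn, E.Ssn
          FROM EMPLOYEE AS E, SUP_EMP AS S
          WHERE E.Super_ssn = S.EmpSsn)
    SELECT SupSsn, EmpSsn
    FROM SUP_EMP
    ORDER BY SupSsn, EmpSsn
""").fetchall()

print(sup_emp)
```

The base query contributes the two direct pairs; one round of the recursive part adds the indirect pair (Borg, Smith), after which no new tuples appear and the evaluation stops.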
Select SQL Statement: A query in SQL can consist of up to six clauses, but only the first two,
SELECT and FROM, are mandatory. The clauses are specified in the following order:
SELECT <attribute and function list>
FROM <table list>
[WHERE <condition>]
[GROUP BY <grouping attribute(s)>]
[HAVING <group condition>]
[ORDER BY <attribute list>];
• The SELECT clause lists the attributes or functions to be retrieved.
• The FROM clause specifies all the relations or tables needed in the query, including joined
relations but not those in nested queries.
• The WHERE clause specifies the conditions for selection of tuples from these relations,
including join conditions if needed.
• GROUP BY specifies grouping attributes, whereas HAVING specifies a condition on the
groups being selected rather than on the individual tuples.
• The built-in aggregate functions COUNT, SUM, AVG, MIN and MAX are used in conjunction
with grouping, but they can also be applied to all the selected tuples in a query without the
GROUP BY clause.
• ORDER BY specifies an order for displaying the result of a query. It is applied at the end
to sort the query result.
• A query is evaluated conceptually by first applying the FROM clause, followed by the
WHERE clause, and then by GROUP BY and HAVING.
1. For each department, retrieve the department number, the number of employees in the
department, and their average salary.
SELECT Dno, COUNT(*), AVG(Salary)
FROM EMPLOYEE
GROUP BY Dno;
2. For each project, retrieve the project number, the project name, and the number of
employees who work on that project.
SELECT Pnumber, Pname, COUNT(*)
FROM PROJECT, WORKS_ON
WHERE Pnumber = Pno
GROUP BY Pnumber, Pname;

The HAVING clause was added to SQL because the WHERE keyword cannot be used with
aggregate functions; HAVING filters groups on aggregate values.
1. For each project on which more than two employees work, retrieve the project number,
the project name, and the number of employees who work on the project.
SELECT Pnumber, Pname, COUNT(*)
FROM PROJECT, WORKS_ON
WHERE Pnumber = Pno
GROUP BY Pnumber, Pname
HAVING COUNT(*) > 2;


4.2 SPECIFYING CONSTRAINTS AS ASSERTIONS AND ACTIONS AS TRIGGERS


The CREATE ASSERTION can be used to specify additional types of constraints that are
outside the scope of the built-in relational model constraints (primary and unique keys, entity
integrity and referential integrity). These built-in constraints can be specified in CREATE
TABLE statement of SQL.
The CREATE TRIGGER can be used to specify automatic actions that the database systems
will perform when certain events and conditions occur. This type of functionality is referred to
as active databases.
1. Specifying General Constraints as Assertions in SQL:
In SQL, users can specify general constraints via declarative assertions, using the CREATE
ASSERTION statement of the DDL.  Each assertion is given a constraint name and is specified
via a condition similar to the WHERE clause of an SQL query.
For example, to specify the constraint that "the salary of an employee must not be greater than
the salary of the manager of the department that the employee works for" in SQL, we can write
the following assertion:
CREATE ASSERTION SALARY_CONSTRAINT
CHECK (NOT EXISTS (SELECT * FROM EMPLOYEE E, EMPLOYEE M, DEPARTMENT D
WHERE E.SALARY > M.SALARY AND E.DNO = D.DNUMBER AND D.MGRSSN = M.SSN));
• Assertion Name: SALARY_CONSTRAINT is the name given to this assertion.
• Assertion Condition:
• NOT EXISTS (...): This is the core of the assertion condition. It checks for the
non-existence of any rows that satisfy the following conditions:
• SELECT * FROM EMPLOYEE E, EMPLOYEE M, DEPARTMENT D: This part
of the query forms a Cartesian product (implicit join) of two copies of the EMPLOYEE
table (aliased E and M) and the DEPARTMENT table. This means it considers every
possible combination of rows from these tables.
• WHERE E.SALARY > M.SALARY: Specifies that we are interested in cases where
an employee E has a salary greater than another employee M.
• AND E.DNO = D.DNUMBER: Ensures that employee E belongs to department D.
• AND D.MGRSSN = M.SSN: Specifies that the department D is managed by employee
M.
The query inside NOT EXISTS therefore selects every combination of rows that violates the
constraint; the assertion holds only while that result is empty.
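Many DBMS products do not implement CREATE ASSERTION, so the same condition is often checked with an ordinary query. The following is a minimal sketch in Python using the standard sqlite3 module; the table contents and the helper name salary_constraint_holds are assumptions made for illustration, not part of the COMPANY database definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN TEXT PRIMARY KEY, SALARY INTEGER, DNO INTEGER);
CREATE TABLE DEPARTMENT (DNUMBER INTEGER PRIMARY KEY, MGRSSN TEXT);
INSERT INTO DEPARTMENT VALUES (5, '333');
INSERT INTO EMPLOYEE VALUES ('333', 40000, 5);  -- manager of department 5
INSERT INTO EMPLOYEE VALUES ('123', 30000, 5);  -- ordinary employee
""")

# Same query as inside NOT EXISTS: it returns the violating employees.
VIOLATIONS = """
SELECT E.SSN FROM EMPLOYEE E, EMPLOYEE M, DEPARTMENT D
WHERE E.SALARY > M.SALARY AND E.DNO = D.DNUMBER AND D.MGRSSN = M.SSN
"""

def salary_constraint_holds(conn):
    """Hypothetical helper: True when no row violates the constraint."""
    return conn.execute(VIOLATIONS).fetchone() is None

holds_before = salary_constraint_holds(conn)
# Give the employee a salary higher than the manager's.
conn.execute("UPDATE EMPLOYEE SET SALARY = 45000 WHERE SSN = '123'")
holds_after = salary_constraint_holds(conn)
print(holds_before, holds_after)
```

Unlike a real assertion, this check does not stop the offending update; it only reports that the condition no longer holds.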
2. Trigger in SQL:
• The CREATE TRIGGER statement is used to specify the type of action to be taken when
certain events occur and when certain conditions are satisfied. For example, it may be useful to
specify a condition that, if violated, causes some user to be informed of the violation. A
manager may want to be informed if an employee's travel expenses exceed a certain limit by
receiving a message whenever this occurs. The action that the DBMS must take in this case is
to send an appropriate message to that user. The condition is thus used to monitor the database.

Other actions may be specified, such as executing a specific stored procedure or triggering
other updates.
Example: Suppose we want to check whenever an employee's salary is greater than the salary
of his or her direct supervisor in the COMPANY database. Several events can trigger this rule:
inserting a new employee record, changing an employee's salary, or changing an employee's
supervisor. Suppose that the action to take would be to call an external stored procedure
INFORM_SUPERVISOR, which will notify the supervisor. The trigger could then be written
as below.
CREATE TRIGGER SALARY_VIOLATION
BEFORE INSERT OR UPDATE OF SALARY, SUPERVISOR_SSN ON EMPLOYEE
FOR EACH ROW
WHEN (NEW.SALARY > (SELECT SALARY FROM EMPLOYEE WHERE SSN = NEW.SUPERVISOR_SSN))
INFORM_SUPERVISOR(NEW.Supervisor_ssn, NEW.Ssn);
The trigger is given the name SALARY_VIOLATION, which can be used to remove or
deactivate the trigger later.
• A typical trigger, which is regarded as an ECA (Event, Condition, Action) rule, has three
components:
1. The event(s): These are usually database update operations that are explicitly applied to the
database. The person who writes the trigger must make sure that all possible events are
accounted for. In some cases, it may be necessary to write more than one trigger to cover all
possible cases. These events are specified after the keyword BEFORE, which means that the
trigger should be executed before the triggering operation is executed. An alternative is to use
the keyword AFTER, which specifies that the trigger should be executed after the operation
specified in the event is completed.
2. The condition that determines whether the rule action should be executed: Once the
triggering event has occurred, an optional condition may be evaluated. If no condition is
specified, the action will be executed once the event occurs. If a condition is specified, it is
first evaluated, and only if it evaluates to true will the rule action be executed. The condition
is specified in the WHEN clause of the trigger.
3. The action to be taken: The action is usually a sequence of SQL statements, but it could
also be a database transaction or an external program that will be automatically executed.
In this example, the action is to execute the stored procedure INFORM_SUPERVISOR.
• Triggers can be used in various applications, such as maintaining database consistency,
monitoring database updates, and updating derived data automatically.
• A trigger specifies an event, a condition and an action. The action is to be executed
automatically if the condition is satisfied when the event occurs.
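The event, condition and action parts can be tried out with a small sketch in Python and the standard sqlite3 module. Several things here are assumptions for illustration: SQLite accepts only one event per trigger, so the rule is split into two triggers, and because SQLite has no stored procedures the action writes a row into a hypothetical SALARY_ALERT table instead of calling INFORM_SUPERVISOR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Salary INTEGER, Supervisor_ssn TEXT);
-- Hypothetical table standing in for the INFORM_SUPERVISOR procedure.
CREATE TABLE SALARY_ALERT (Supervisor_ssn TEXT, Ssn TEXT);

-- Event: INSERT. Condition: WHEN clause. Action: body between BEGIN and END.
CREATE TRIGGER SALARY_VIOLATION_INS
BEFORE INSERT ON EMPLOYEE
FOR EACH ROW
WHEN NEW.Salary > (SELECT Salary FROM EMPLOYEE WHERE Ssn = NEW.Supervisor_ssn)
BEGIN
    INSERT INTO SALARY_ALERT VALUES (NEW.Supervisor_ssn, NEW.Ssn);
END;

-- Event: UPDATE of the salary or the supervisor.
CREATE TRIGGER SALARY_VIOLATION_UPD
BEFORE UPDATE OF Salary, Supervisor_ssn ON EMPLOYEE
FOR EACH ROW
WHEN NEW.Salary > (SELECT Salary FROM EMPLOYEE WHERE Ssn = NEW.Supervisor_ssn)
BEGIN
    INSERT INTO SALARY_ALERT VALUES (NEW.Supervisor_ssn, NEW.Ssn);
END;
""")

conn.execute("INSERT INTO EMPLOYEE VALUES ('111', 50000, NULL)")   # supervisor
conn.execute("INSERT INTO EMPLOYEE VALUES ('222', 40000, '111')")  # condition false
conn.execute("UPDATE EMPLOYEE SET Salary = 60000 WHERE Ssn = '222'")  # condition true

alerts = conn.execute("SELECT Supervisor_ssn, Ssn FROM SALARY_ALERT").fetchall()
print(alerts)
```

Only the UPDATE satisfies the WHEN condition, so a single alert row is recorded for supervisor '111' and employee '222'.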


4.3 VIEWS (VIRTUAL TABLES) IN SQL

1. Concept of a View in SQL:

• A view in SQL is a single table that is derived from other tables, which could be base tables
or previously defined views.
• A view does not necessarily exist in physical form; it is considered a virtual table, in contrast
to base tables, whose tuples are always physically stored in the database. This limits the
possible update operations that can be applied to views, but it does not place any limitations
on querying a view.
• A view is a way of specifying a table that we need to reference frequently, even though it may
not exist physically.
• Queries on a view are specified in the same way as single-table retrievals.

2. Specification of Views in SQL:

• A view is specified by the SQL command CREATE VIEW.
• The view is given a (virtual) table name (or view name), a list of attribute names, and a query
to specify the contents of the view.
• If none of the view attributes results from applying functions or arithmetic operations,
attribute names for the view need not be specified, since they would be the same as the names
of the attributes of the defining tables in the default case.
• The view WORKS_ON_VIEW does not have new attribute names, as it inherits the names of
the view attributes from the defining tables EMPLOYEE, PROJECT and WORKS_ON.
• The view DEPT_INFO explicitly specifies new attribute names using a one-to-one
correspondence between the attributes specified in the CREATE VIEW clause and those
specified in the SELECT clause of the query that defines the view.

Example: To retrieve the last name and first name of all employees who work on the 'ProductX'
project.
QV: SELECT Fname, Lname FROM WORKS_ON_VIEW WHERE Pname = 'ProductX';

Advantages of view:
• It simplifies the specification of certain queries.
• It is also used as a security and authorization mechanism.

• A view should always be up to date; i.e., if we modify the tuples in the base tables on which
the view is defined, the view must automatically reflect these changes. Hence a view is not
realized at the time of view definition but when we specify a query on the view.
• It is the responsibility of the DBMS, not of the user, to ensure that a view is up to date.
• If a view is not needed, it can be removed by the DROP VIEW command.
Eg: DROP VIEW WORKS_ON_VIEW;
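The points above can be tried out with a short sketch in Python and the standard sqlite3 module. The two view definitions and the sample rows are assumptions made for illustration (the notes name the views WORKS_ON_VIEW and DEPT_INFO but their exact definitions are not reproduced in the text), so treat this as one plausible version rather than the definitive one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Fname TEXT, Lname TEXT,
                       Salary INTEGER, Dno INTEGER);
CREATE TABLE DEPARTMENT (Dnumber INTEGER PRIMARY KEY, Dname TEXT);
CREATE TABLE PROJECT (Pnumber INTEGER PRIMARY KEY, Pname TEXT);
CREATE TABLE WORKS_ON (Essn TEXT, Pno INTEGER, Hours REAL);

INSERT INTO DEPARTMENT VALUES (5, 'Research');
INSERT INTO EMPLOYEE VALUES ('123', 'John', 'Smith', 30000, 5);
INSERT INTO EMPLOYEE VALUES ('453', 'Joyce', 'English', 25000, 5);
INSERT INTO PROJECT VALUES (1, 'ProductX');
INSERT INTO PROJECT VALUES (2, 'ProductY');
INSERT INTO WORKS_ON VALUES ('123', 1, 32.5);

-- Attribute names are inherited from the defining tables.
CREATE VIEW WORKS_ON_VIEW AS
  SELECT Fname, Lname, Pname, Hours
  FROM EMPLOYEE, PROJECT, WORKS_ON
  WHERE Ssn = Essn AND Pno = Pnumber;

-- Attribute names are given explicitly because two columns are aggregates.
CREATE VIEW DEPT_INFO (Dept_name, No_of_emps, Total_sal) AS
  SELECT Dname, COUNT(*), SUM(Salary)
  FROM DEPARTMENT, EMPLOYEE
  WHERE Dnumber = Dno
  GROUP BY Dname;
""")

QV = ("SELECT Fname, Lname FROM WORKS_ON_VIEW "
      "WHERE Pname = 'ProductX' ORDER BY Lname")

before = conn.execute(QV).fetchall()
# Change a base table; the view is not touched directly.
conn.execute("INSERT INTO WORKS_ON VALUES ('453', 1, 20.0)")
after = conn.execute(QV).fetchall()
dept_info = conn.execute("SELECT * FROM DEPT_INFO").fetchall()
print(before, after, dept_info)
```

The second run of QV includes Joyce English even though only the base table WORKS_ON was changed, which is the "always up to date" behaviour described above.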
3. View Implementation and View Update:
• Two main approaches have been suggested for efficiently implementing a view for
querying.
• The strategy of query modification involves modifying or transforming the view query into
a query on the underlying base tables.
Example: The query QV would automatically be modified to the following query by the
DBMS.
SELECT FNAME, LNAME FROM EMPLOYEE, PROJECT, WORKS_ON WHERE
SSN=ESSN AND PNO=PNUMBER AND PNAME='ProductX';
• The disadvantage of this approach is that it is inefficient for views defined via complex
queries that are time consuming to execute, especially if multiple queries are applied to the
view within a short time.
• The other strategy, view materialization, involves physically creating a temporary view table
when the view is first queried and keeping that table on the assumption that other queries on
the view will follow.
Here, an efficient strategy to automatically update the view when the base tables are updated
must be developed to keep the view up to date. Incremental update techniques have been
developed to determine what new tuples must be inserted, deleted or modified in a
materialized view table when a change is applied to one of the defining base tables. The view
is generally kept as a materialized (physically stored) table as long as it is being queried. If the
view is not queried for a certain period of time, the system may then automatically remove the
physical table and recompute it from scratch when future queries reference the view.
• Different strategies as to when a materialized view is updated are possible.
• The immediate update strategy updates a view as soon as the base tables are changed.
• The lazy update strategy updates the view when needed by a view query.
• The periodic update strategy updates the view periodically (with this last strategy, a view
query may get a result that is not up to date).
• A retrieval query against any view can always be issued. But issuing an INSERT, DELETE,
or UPDATE command on a view table is in many cases not possible.
• In general, an update on a view defined on a single table without any aggregate functions can
be mapped to an update on the underlying base table under certain conditions.

• For a view involving joins, an update operation may be mapped to update operations on the
underlying base relations in multiple ways. Hence, it is often not possible for the DBMS to
determine which of the updates is intended.
Example: Suppose that we issue the command to update the Pname attribute of 'John Smith'
from 'ProductX' to 'ProductY' in the view WORKS_ON_VIEW.
This view update is shown in UV1:
UV1: UPDATE WORKS_ON_VIEW SET Pname = 'ProductY' WHERE Lname = 'Smith'
AND Fname = 'John' AND Pname = 'ProductX';
This query can be mapped into several updates on the base relations to give the desired update
effect on the view. Some of these updates will create additional side effects that affect the result
of other queries. Consider two possible updates, (a) and (b), on the base relations corresponding
to UV1. Update (a) relates 'John Smith' to the 'ProductY' PROJECT tuple in place of the
'ProductX' PROJECT tuple and is the most likely desired update.

Update (b) would also give the desired update effect on the view, but it accomplishes this by
changing the name of the 'ProductX' tuple in the PROJECT relation to 'ProductY'. It is quite
unlikely that the user who specified the view update UV1 wants the update to be interpreted as
in (b), since it also has the side effect of changing all the view tuples with Pname 'ProductX'.
• Some view updates may not make much sense. For example, modifying the Total_sal
attribute of the DEPT_INFO view does not make sense because Total_sal is defined to be the
sum of the individual employee salaries. This request is referred to as UV2.

• A view update is feasible when only one possible update on the base relations can accomplish
the desired update effect on the view.
• Whenever an update on the view can be mapped to more than one update on the underlying
base relations, it is usually not permitted.
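The ambiguity of UV1 can be made concrete with a small sketch in Python and the standard sqlite3 module. The statements UPDATE_A and UPDATE_B are one possible way to write updates (a) and (b) against the base tables, and the sample rows (two employees on 'ProductX') are assumptions chosen so that the side effect of (b) is visible.

```python
import sqlite3

SETUP = """
CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Fname TEXT, Lname TEXT);
CREATE TABLE PROJECT (Pnumber INTEGER PRIMARY KEY, Pname TEXT);
CREATE TABLE WORKS_ON (Essn TEXT, Pno INTEGER);
INSERT INTO EMPLOYEE VALUES ('123', 'John', 'Smith');
INSERT INTO EMPLOYEE VALUES ('453', 'Joyce', 'English');
INSERT INTO PROJECT VALUES (1, 'ProductX');
INSERT INTO PROJECT VALUES (2, 'ProductY');
INSERT INTO WORKS_ON VALUES ('123', 1);
INSERT INTO WORKS_ON VALUES ('453', 1);
CREATE VIEW WORKS_ON_VIEW AS
  SELECT Fname, Lname, Pname FROM EMPLOYEE, PROJECT, WORKS_ON
  WHERE Ssn = Essn AND Pno = Pnumber;
"""

# (a) Re-link John Smith's WORKS_ON tuple to the 'ProductY' project.
UPDATE_A = """
UPDATE WORKS_ON
SET Pno = (SELECT Pnumber FROM PROJECT WHERE Pname = 'ProductY')
WHERE Essn IN (SELECT Ssn FROM EMPLOYEE
               WHERE Lname = 'Smith' AND Fname = 'John')
  AND Pno = (SELECT Pnumber FROM PROJECT WHERE Pname = 'ProductX')
"""

# (b) Rename the 'ProductX' project itself.
UPDATE_B = "UPDATE PROJECT SET Pname = 'ProductY' WHERE Pname = 'ProductX'"

def view_after(update_sql):
    """Apply one base-table update to a fresh database and return the view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    conn.execute(update_sql)
    return conn.execute(
        "SELECT Fname, Lname, Pname FROM WORKS_ON_VIEW ORDER BY Fname"
    ).fetchall()

rows_a = view_after(UPDATE_A)
rows_b = view_after(UPDATE_B)
print(rows_a)
print(rows_b)
```

Both updates move John Smith to 'ProductY' in the view, but only (b) also changes Joyce English's row, which is the side effect that makes (b) an unlikely interpretation.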

• A view with a single defining table is updatable if the view attributes contain the primary key
of the base relation, as well as all attributes with the NOT NULL constraint that do not have
default values specified.
• Views defined on multiple tables using joins are generally not updatable.
• Views defined using grouping and aggregate functions are not updatable.
• In SQL, the clause WITH CHECK OPTION should be added at the end of the view definition
if a view is to be updated by INSERT, DELETE, or UPDATE statements. This allows the
system to reject operations that violate the SQL rules for view updates.
• It is also possible to define a view table in the FROM clause of an SQL query. This is known
as an in-line view.
4. Views as Authorization Mechanisms:
• Views can be used to hide certain attributes or tuples from unauthorized users.
• Suppose a certain user is only allowed to see employee information for employees who work
for department 5; then we can create the following view DEPTEMP and grant the user the
privilege to query the view but not the base table EMPLOYEE itself.
This user will only be able to retrieve employee information for employee tuples whose Dno =
5, and will not be able to see other employee tuples when the view is queried.
CREATE VIEW DEPTEMP AS SELECT * FROM EMPLOYEE WHERE Dno = 5;
• A view can restrict a user to only see certain columns; for example, only the first name, last
name, and address of an employee may be visible as follows:
CREATE VIEW BASIC_EMP_DATA AS SELECT Fname, Lname, Address FROM
EMPLOYEE;
• By creating an appropriate view and granting certain users access to the view and not the base
tables, they would be restricted to retrieving only the data specified in the view.
4.4 SCHEMA CHANGE STATEMENTS IN SQL
1. Alter Schema:
ALTER is generally used to change the definition of a table in SQL. In SQL Server,
ALTER SCHEMA is used to transfer a securable (object) from one schema to another
within the same database.
Syntax: ALTER SCHEMA target_schema_name TRANSFER [ entity_type :: ] securable_name;
• target_schema_name is the name of the schema into which the object should be
transferred.
• TRANSFER is a keyword that transfers the object from one schema to the other.
• entity_type is the kind of object that is to be transferred.
• securable_name is the name of the object to be transferred, optionally qualified by the
schema in which it is currently present.

Example:
A database named university has two schemas: student and lecturer.
Suppose the marks table of the student schema has to be transferred to the lecturer schema;
the query is as follows:
ALTER SCHEMA lecturer TRANSFER student.marks;
This way, marks is transferred to the lecturer schema.
For example, to add an attribute for keeping track of jobs of employees to the EMPLOYEE
base relation in the COMPANY schema, we can use the command:
ALTER TABLE COMPANY.EMPLOYEE ADD COLUMN Job VARCHAR(12);
• We must still enter a value for the new attribute Job for each individual EMPLOYEE
tuple. This can be done either by specifying a default clause or by using the UPDATE
command individually on each tuple.
• If no default clause is specified, the new attribute will have NULLs in all the tuples of
the relation immediately after the command is executed; hence, the NOT NULL constraint
is not allowed in this case.
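The behaviour of ADD COLUMN described above can be observed with a short sketch in Python and the standard sqlite3 module. SQLite is used only as a convenient stand-in, so the schema prefix COMPANY is dropped, and the columns Grade and Title are hypothetical names added for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Lname TEXT)")
conn.execute("INSERT INTO EMPLOYEE VALUES ('123456789', 'Smith')")

# No default clause: existing tuples get NULL for the new attribute.
conn.execute("ALTER TABLE EMPLOYEE ADD COLUMN Job VARCHAR(12)")
job_after_add = conn.execute("SELECT Job FROM EMPLOYEE").fetchall()

# NOT NULL without a default cannot be satisfied by the existing tuple.
try:
    conn.execute("ALTER TABLE EMPLOYEE ADD COLUMN Grade VARCHAR(12) NOT NULL")
    not_null_rejected = False
except sqlite3.Error:
    not_null_rejected = True

# With a default clause, NOT NULL is acceptable.
conn.execute(
    "ALTER TABLE EMPLOYEE ADD COLUMN Title VARCHAR(12) NOT NULL DEFAULT 'Staff'"
)
# Alternatively, fill in the value tuple by tuple with UPDATE.
conn.execute("UPDATE EMPLOYEE SET Job = 'Engineer' WHERE Ssn = '123456789'")
row = conn.execute("SELECT Job, Title FROM EMPLOYEE").fetchone()
print(job_after_add, not_null_rejected, row)
```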
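Alter Table - Alter/Modify Column
To change the data type of a column in a table, use the following syntax (the MODIFY
keyword is used by MySQL and Oracle; some other systems, such as SQL Server, use
ALTER COLUMN instead):
ALTER TABLE table_name
MODIFY column_name datatype;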

2. Drop Schema:
The DROP command can be used to drop named schema elements, such as tables, domains,
or constraints. One can also drop a schema. For example, if a whole schema is no longer
needed, the DROP SCHEMA command can be used.
• There are two drop behavior options: CASCADE and RESTRICT.
• For example, to remove the COMPANY schema and all its elements:
DROP SCHEMA COMPANY CASCADE;
• If the RESTRICT option is chosen in place of CASCADE, the schema is dropped only if it
has no elements in it; otherwise, the DROP command will not be executed.


• To use the RESTRICT option, the user must first individually drop each element in the
schema, then drop the schema itself.
• The DROP TABLE command not only deletes all the records in the table if successful, but
also removes the table definition from the catalog. If it is desired to delete only the records
but to leave the table definition for future use, then the DELETE command should be used
instead of DROP TABLE.
• The DROP command can also be used to drop other types of named schema elements, such
as constraints or domains.
• For example, the following command removes the attribute Address from the
EMPLOYEE base table:
ALTER TABLE COMPANY.EMPLOYEE DROP COLUMN Address CASCADE;
• It is also possible to alter a column definition by dropping an existing default clause or by
defining a new default clause. The following examples illustrate this:
ALTER TABLE COMPANY.DEPARTMENT ALTER COLUMN Mgr_ssn DROP DEFAULT;
ALTER TABLE COMPANY.DEPARTMENT ALTER COLUMN Mgr_ssn SET DEFAULT '333445555';


CHAPTER 2:
Introduction to Transaction Processing Concepts and Theory
The concept of transaction provides a mechanism for describing logical units of database
processing. Transaction processing systems are systems with large databases and hundreds of
concurrent users executing database transactions. Examples of such systems include airline
reservations, banking, credit card processing, online retail purchasing, stock markets,
supermarket checkouts, and many other applications.

Introduction to Transaction Processing
In this section we discuss the concepts of concurrent execution of transactions and recovery
from transaction failures.

5.2.1 Single-User versus Multiuser Systems


• One criterion for classifying a database system is according to the number of users who can
use the system concurrently.
• A DBMS is single-user if at most one user at a time can use the system, and it is multiuser if
many users can use the system— and hence access the database—concurrently.
• Single-user DBMSs are mostly restricted to personal computer systems; most other DBMSs
are multiuser.
• For example, an airline reservations system is used by hundreds of travel agents and
reservation clerks concurrently. Database systems used in banks, insurance agencies, stock
exchanges, supermarkets, and many other applications are multiuser systems. In these systems,
hundreds or thousands of users are typically operating on the database by submitting
transactions concurrently to the system.

Fig 21.2
• Multiple users can access databases—and use computer systems—simultaneously because of
the concept of multiprogramming, which allows the operating system of the computer to
execute multiple programs—or processes—at the same time.
• A single central processing unit (CPU) can only execute at most one process at a time.
However, multiprogramming operating systems execute some commands from one process,
then suspend that process and execute some commands from the next process, and so on.

A process is resumed at the point where it was suspended whenever it gets its turn to use the
CPU again. Hence, concurrent execution of processes is actually interleaved, as illustrated in
Figure 21.1, which shows two processes, A and B, executing concurrently in an interleaved
fashion. Interleaving keeps the CPU busy when a process requires an input or output (I/O)
operation, such as reading a block from disk. The CPU is switched to execute another process
rather than remaining idle during I/O time.
Interleaving also prevents a long process from delaying other processes. If the computer
system has multiple hardware processors (CPUs), parallel processing of multiple processes is
possible, as illustrated by processes C and D in Figure 21.1.
5.2.2 Transactions, Database items, Read & Write operations, DBMS buffers
A transaction is an executing program that forms a logical unit of database processing. It
includes one or more DB access operations such as insertion, deletion, modification, or
retrieval. Transactions can be embedded within an application program using begin
transaction and end transaction statements, or specified interactively via a high-level query
language like SQL. Read-only transactions do not update the database, while read-write
transactions do. The size of a data item is called its granularity.
Basic DB access operations include read_item(X) to read a DB item into a program variable
and write_item(X) to write a program variable's value into the DB item.
Executing read_item(X) involves finding the disk block address, copying the block to a
buffer, and copying the item from the buffer to a program variable.
Executing write_item(X) involves finding the disk block address, copying the block
to a buffer, copying the item from the program variable to the buffer, and storing the updated
block back to disk. The recovery manager handles when to store modified blocks. A DB cache
includes data buffers, and when these are full, a buffer replacement policy, such as LRU, is
used.
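The steps of read_item and write_item, together with LRU replacement, can be sketched in Python as follows. This is a simplified model, not how any particular DBMS is built: the class name BufferPool, the dictionary standing in for the disk, and the choice to write a modified block back only when it is evicted are all assumptions made for illustration.

```python
from collections import OrderedDict

class BufferPool:
    """Toy DB cache: block_id -> in-memory copy of a disk block (LRU order)."""

    def __init__(self, disk, capacity):
        self.disk = disk              # block_id -> {item_name: value}
        self.capacity = capacity      # number of buffers available
        self.buffers = OrderedDict()  # least recently used block comes first
        self.dirty = set()            # blocks modified since they were loaded

    def _fetch(self, block_id):
        if block_id in self.buffers:
            self.buffers.move_to_end(block_id)      # mark as most recently used
        else:
            if len(self.buffers) >= self.capacity:  # cache full: replace LRU
                victim, data = self.buffers.popitem(last=False)
                if victim in self.dirty:            # store updated block on disk
                    self.disk[victim] = data
                    self.dirty.discard(victim)
            self.buffers[block_id] = dict(self.disk[block_id])  # copy block in
        return self.buffers[block_id]

    def read_item(self, block_id, name):
        # Copy the item from the buffer to a program variable.
        return self._fetch(block_id)[name]

    def write_item(self, block_id, name, value):
        # Copy the program variable into the buffer; disk is updated later.
        self._fetch(block_id)[name] = value
        self.dirty.add(block_id)

disk = {"B1": {"X": 80}, "B2": {"Y": 20}, "B3": {"Z": 5}}
pool = BufferPool(disk, capacity=2)

x = pool.read_item("B1", "X")
pool.write_item("B1", "X", x - 5)
x_on_disk_before_eviction = disk["B1"]["X"]   # still the old value

pool.read_item("B2", "Y")
pool.read_item("B3", "Z")                     # evicts B1, the LRU block
x_on_disk_after_eviction = disk["B1"]["X"]
cached_blocks = list(pool.buffers)
print(x_on_disk_before_eviction, x_on_disk_after_eviction, cached_blocks)
```

The new value of X reaches the disk only when block B1 is replaced, which shows why the decision of when to store modified blocks matters for recovery.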

Consider two example transactions, T1 and T2, described in terms of their read and write
operations. Transaction T1 reads items X and Y, modifies them, and then writes them back,
while transaction T2 reads item X, modifies it, and writes it back. The read-set of a
transaction is the set of all items that the transaction reads, and the write-set is the set of all
items that the transaction writes. For example, in transaction T1, both the read-set and
write-set are {X, Y}.


5.2.3 Why Concurrency Control is needed:


a) The Lost Update Problem: This occurs when two transactions that access the same database
items have their operations interleaved in a way that makes the value of some database item
incorrect.
For example, consider a transaction T1 that transfers N reservations from one flight whose
number of reserved seats is stored in the database item named X to another flight whose number
of reserved seats is stored in the database item named Y. If another transaction T2 reads X
before T1 writes it and writes X afterwards, the update made by T1 is lost.
b) The Temporary Update (or Dirty Read) Problem: This problem occurs when one
transaction updates a database item and then the transaction fails for some reason. Meanwhile,
the updated item is read by another transaction before it is changed back to its original value.

c) The Incorrect Summary Problem: If one transaction is calculating an aggregate summary
function on a number of database items while other transactions are updating some of these
items, the aggregate function may calculate some values before they are updated and others
after they are updated.


d) The Unrepeatable Read Problem: Another problem that may occur is called unrepeatable
read, where a transaction T reads the same item twice and the item is changed by another
transaction T′ between the two reads.
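The lost update problem in (a) can be reproduced with a few lines of Python. This is only a hand-written simulation: the starting values (X = 80, Y = 20), N = 5 and M = 4 are assumed numbers, T2 is assumed to add M reservations to X, and the interleaving is fixed in the code rather than produced by a real scheduler.

```python
def serial_t1_then_t2(db, n, m):
    """Run T1 completely, then T2 (a correct serial execution)."""
    db = dict(db)
    x = db["X"]; x = x - n; db["X"] = x   # T1: read_item(X), X := X - N, write_item(X)
    y = db["Y"]; y = y + n; db["Y"] = y   # T1: read_item(Y), Y := Y + N, write_item(Y)
    x = db["X"]; x = x + m; db["X"] = x   # T2: read_item(X), X := X + M, write_item(X)
    return db

def interleaved(db, n, m):
    """Interleave T1 and T2 so that T2 reads X before T1 writes it."""
    db = dict(db)
    x1 = db["X"]          # T1: read_item(X)
    x1 = x1 - n           # T1: X := X - N
    x2 = db["X"]          # T2: read_item(X) still sees the old value
    x2 = x2 + m           # T2: X := X + M
    db["X"] = x1          # T1: write_item(X)
    y1 = db["Y"]          # T1: read_item(Y)
    db["X"] = x2          # T2: write_item(X) overwrites T1's update
    y1 = y1 + n           # T1: Y := Y + N
    db["Y"] = y1          # T1: write_item(Y)
    return db

start = {"X": 80, "Y": 20}
correct = serial_t1_then_t2(start, n=5, m=4)
lost = interleaved(start, n=5, m=4)
print(correct, lost)
```

The serial run ends with X = 79, while the interleaved run ends with X = 84 because the subtraction of N by T1 has been overwritten.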
Why Recovery Is Needed?
Whenever a transaction is submitted to a DBMS for execution, the system is responsible for
making sure that either all the operations in the transaction are completed successfully and their
effect is recorded permanently in the database, or that the transaction does not have any effect
on the database or any other transactions.
Types of Failures:
Failures are generally classified as transaction, system, and media failures. There are several
possible reasons for a transaction to fail in the middle of execution:
1. A computer failure (system crash).
A hardware, software, or network error occurs in the computer system during transaction
execution. Hardware crashes are usually media failures, for example, main memory failure.
2. A transaction or system error.
The transaction is unable to complete because of an internal error or an external system issue,
such as a logical error, a syntax error, a system crash, or a disk failure.
3. Local errors or exception conditions detected by the transaction.
During transaction execution, certain conditions may occur that necessitate cancellation
of the transaction.
4. Concurrency control enforcement.
The concurrency control method may decide to abort a transaction, to be restarted later,
because it violates serializability or because several transactions are in a state of deadlock.
5. Disk failure.
Some disk blocks may lose their data because of a read or write malfunction, because of
a disk read/write head crash, or because of bad sectors (damaged areas of the disk).
6. Physical problems and catastrophes.
This refers to an endless list of problems that includes power or air-conditioning failure,
fire, theft, overwriting disks or tapes by mistake, and mounting of a wrong tape by the
operator.

5.3 Transaction and System Concepts

Transaction States and Additional Operations

• BEGIN_TRANSACTION: This marks the beginning of transaction execution.
• READ or WRITE: These specify read or write operations on the database items that are
executed as part of a transaction.
• END_TRANSACTION: This specifies that READ and WRITE transaction operations
have ended and marks the end of transaction execution.
• COMMIT_TRANSACTION: This signals a successful end of the transaction so that any
changes (updates) executed by the transaction can be safely committed to the database and
will not be undone.
• ROLLBACK (or ABORT): This signals that the transaction has ended unsuccessfully, so
that any changes or effects that the transaction may have applied to the database must be
undone.

5.3.2 The SYSTEM LOG


• Log or Journal: The log keeps track of all transaction operations that affect the values of
database items.
• This information may be needed to permit recovery from transaction failures.
• The log is kept on disk, so it is not affected by any type of failure except for disk or
catastrophic failure.
• In addition, the log is periodically backed up to archival storage (tape) to guard against such
catastrophic failures.
• Protocols for recovery that avoid cascading rollbacks do not require that read operations
be written to the system log, whereas other protocols require these entries for recovery.
Strict protocols require simpler write entries that do not include new_value.

The following are the types of entries, called log records, that are written to the log:

1. [start_transaction, T]: Indicates that transaction T has started execution.
2. [write_item, T, X, old_value, new_value]: Indicates that transaction T has changed the value
of database item X from old_value to new_value.
3. [read_item, T, X]: Indicates that transaction T has read the value of database item X.
4. [commit, T]: Indicates that transaction T has completed successfully and affirms that its
effect can be committed to the database.
5. [abort, T]: Indicates that transaction T has been aborted.

5.3.3 Commit Point of a Transaction


• A transaction T reaches its commit point when all its operations that access the database
have been executed successfully and the effect of all the transaction operations on the
database have been recorded in the log.
• Beyond the commit point, the transaction is said to be committed, and its effect must be
permanently recorded in the database.
• Transactions that have written their commit record in the log must also have recorded all
their WRITE operations in the log, so their effect on the database can be redone from the
log records.
• At the time of a system crash, only the log entries that have been written back to disk are
considered in the recovery process if the contents of main memory are lost.

• Hence, before a transaction reaches its commit point, any portion of the log that has not
been written to the disk yet must now be written to the disk.
• This process is called force-writing the log buffer to disk before committing a transaction.
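How the log records and the commit point are used after a crash can be sketched in Python. The function below is a deliberately simplified undo/redo pass, chosen here only for illustration (the notes do not prescribe a particular recovery algorithm): writes of transactions without a commit record are undone using old_value, and writes of committed transactions are redone using new_value. The tuple layout of the records and the sample values are assumptions.

```python
def recover(disk_db, log):
    """Return the database state after a simple undo/redo pass over the log."""
    db = dict(disk_db)
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Undo: scan the log backwards and restore old values written by
    # transactions that never reached their commit point.
    for rec in reversed(log):
        if rec[0] == "write_item" and rec[1] not in committed:
            _, _, item, old_value, _ = rec
            db[item] = old_value

    # Redo: scan the log forwards and reapply new values written by
    # committed transactions, in case they had not reached the disk.
    for rec in log:
        if rec[0] == "write_item" and rec[1] in committed:
            _, _, item, _, new_value = rec
            db[item] = new_value
    return db

# Log as it was force-written to disk before the crash.
log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 80, 75),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y", 20, 30),
    ("commit", "T1"),
    # crash happens here: T2 has no commit record
]

# Disk state at the crash: T1's write was still only in a buffer,
# while T2's uncommitted write had already been stored on disk.
disk_after_crash = {"X": 80, "Y": 30}
recovered = recover(disk_after_crash, log)
print(recovered)
```

The redo step works only because the write_item and commit records of T1 were forced to disk before T1 was considered committed.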

5.3.4 DBMS-Specific Buffer Replacement Policies


The DBMS cache will hold the disk pages that contain information currently being processed
in main memory buffers.
❖ If all the buffers in the DBMS cache are occupied and new disk pages are required to be
loaded into main memory from disk, a page replacement policy is needed to select the particular
buffers to be replaced.
❖ Some page replacement policies that have been developed specifically for database systems
are briefly discussed next.
Domain Separation (DS) Method: A DBMS contains many kinds of disk pages, such as index
pages, data file pages, and log file pages. The DBMS cache is divided into separate domains
(sets of buffers). Each domain handles one type of disk page, and page replacements within
each domain are handled via the basic LRU (Least Recently Used) page replacement policy.
Hot Set Method: This is useful in queries that have to scan a set of pages repeatedly, such as
when a join operation is performed using the nested-loop method.
If the inner loop file is loaded completely into main memory buffers without replacement (the
hot set), the join will be performed efficiently because each page in the outer loop file will have
to scan all records in the inner loop file to find join matches.
The DBMIN Method: This uses a model known as QLSM (Query Locality Set Model),
which predetermines the pattern of page references for each algorithm for a particular type of
database operation.
This method calculates a locality set using QLSM for each file instance involved in the query.
Then it allocates the appropriate number of buffers to each file instance involved in the query,
based on the locality set for that file instance.

5.4 Desirable Properties of Transactions

• Atomicity: This property states that a transaction is an atomic unit of processing, that is, either
it is performed in its entirety or not performed at all. No partial update should exist.
• Consistency: A transaction should take the database from one consistent state to another
consistent state. It should not adversely affect any data item in the database.
• Isolation: A transaction should be executed as if it is the only one in the system. There should
not be any interference from the other concurrent transactions that are simultaneously running.
• Durability: If a committed transaction brings about a change, that change should be durable
in the database and not lost in case of any failure.
Levels of Isolation
❖ There have been attempts to define the level of isolation of a transaction.
❖ A transaction is said to have level 0 (zero) isolation if it does not overwrite the dirty reads
of higher-level transactions.
❖ Level 1 (one) isolation has no lost updates, and level 2 isolation has no lost updates and no
dirty reads.
❖ Finally, level 3 isolation (also called true isolation) has, in addition to level 2 properties,
repeatable reads.

5.5 Characterizing Schedules Based on Recoverability

Schedules classified under recoverability


5.6 Characterizing Schedules Based on Serializability


5.6.1 Testing conflict serializability of a Schedule S


1. Identify Transactions and Operations: List all transactions involved in schedule S,
and identify the operations (read/write) performed by each transaction.
2. Construct the precedence graph (or conflict graph):
• Create a node for each transaction in the schedule.
• For each pair of conflicting operations (operations from different transactions that access the
same data item, where at least one of them is a write), draw a directed edge from the transaction
containing the earlier operation to the transaction containing the later operation.
3. Check for a cycle in the graph:
• Once you have constructed the precedence graph, check if there is any cycle in this graph.
• If there is no cycle, then the schedule S is conflict serializable.
• If there is a cycle, then the schedule S is not conflict serializable.
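The three steps above can be sketched in Python as follows. The representation of a schedule as a list of (transaction, operation, item) tuples and the function names are assumptions made for this illustration; the cycle check uses an ordinary depth-first search.

```python
from collections import defaultdict

def precedence_graph(schedule):
    """Step 2: build edges Ti -> Tj for every pair of conflicting operations.

    schedule is a list of (transaction, op, item) in execution order,
    where op is 'r' (read) or 'w' (write).
    """
    edges = defaultdict(set)
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if ti != tj and item_i == item_j and "w" in (op_i, op_j):
                edges[ti].add(tj)
    return edges

def has_cycle(nodes, edges):
    """Step 3: depth-first search for a cycle in the directed graph."""
    WHITE, GRAY, BLACK = 0, 1, 2      # unvisited, on current path, finished
    color = {n: WHITE for n in nodes}

    def visit(n):
        color[n] = GRAY
        for m in edges.get(n, ()):
            if color[m] == GRAY:      # back edge: cycle found
                return True
            if color[m] == WHITE and visit(m):
                return True
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in nodes)

def is_conflict_serializable(schedule):
    nodes = {t for t, _, _ in schedule}          # Step 1: the transactions
    return not has_cycle(nodes, precedence_graph(schedule))

# Interleaving in which T2 reads X before T1 writes it and writes X afterwards.
s_interleaved = [("T1", "r", "X"), ("T2", "r", "X"), ("T1", "w", "X"),
                 ("T1", "r", "Y"), ("T2", "w", "X"), ("T1", "w", "Y")]
# T1 finishes its work on X before T2 starts.
s_serial = [("T1", "r", "X"), ("T1", "w", "X"),
            ("T2", "r", "X"), ("T2", "w", "X")]

result_interleaved = is_conflict_serializable(s_interleaved)
result_serial = is_conflict_serializable(s_serial)
print(result_interleaved, result_serial)
```

In the first schedule the graph has edges T1 → T2 and T2 → T1, so it contains a cycle and the schedule is not conflict serializable; the second schedule has only the edge T1 → T2.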


5.7 Transaction Support in SQL


• Transaction Start: Happens implicitly (e.g., when a query is run).
• Transaction End: Must be explicit, using either:
• COMMIT (to save changes)
• ROLLBACK (to undo changes)
Transaction Characteristics (via SET TRANSACTION)
1. Access Mode
• READ WRITE (default): Allows read and write operations.
• READ ONLY: Only allows reading data.
2. Diagnostic Area Size
• DIAGNOSTIC SIZE n: Saves error info for the last n SQL statements.
3. Isolation Level (controls data visibility across transactions):
• READ UNCOMMITTED:
• Most relaxed; allows dirty reads, nonrepeatable reads, and phantoms
• READ COMMITTED:
• Prevents dirty reads
• REPEATABLE READ:
• Prevents dirty and nonrepeatable reads; phantoms may still occur
• SERIALIZABLE (default):
• Most strict; prevents dirty reads, nonrepeatable reads, and phantoms

A sample SQL transaction might look like the following:


EXEC SQL WHENEVER SQLERROR GOTO UNDO;
EXEC SQL SET TRANSACTION
READ WRITE
DIAGNOSTIC SIZE 5
ISOLATION LEVEL SERIALIZABLE;
EXEC SQL INSERT INTO EMPLOYEE
(Fname, Lname, Ssn, Dno, Salary) VALUES ('Robert', 'Smith', '991004321', 2, 35000);
EXEC SQL UPDATE EMPLOYEE
SET Salary = Salary * 1.1 WHERE Dno = 2;
EXEC SQL COMMIT;
GOTO THE_END;
UNDO: EXEC SQL ROLLBACK;
THE_END: ... ;
1. The above transaction consists of first inserting a new row in the EMPLOYEE table and
then updating the salary of all employees who work in department 2.
2. If an error occurs on any of the SQL statements, the entire transaction is rolled back. This
implies that any updated salary (by this transaction) would be restored to its previous value
and that the newly inserted row would be removed.
3. As we have seen, SQL provides a number of transaction-oriented features. The DBA or
database programmers can take advantage of these options to try improving
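The commit and rollback behaviour of the sample transaction can be tried out with a sketch in Python and the standard sqlite3 module. Several details are assumptions for illustration: SQLite has no SET TRANSACTION statement, so the access mode and isolation level are not set; the function hire_and_raise and the second employee are hypothetical; and a CHECK constraint capping Salary at 40000 is added only to force an error in the second run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EMPLOYEE (
        Fname TEXT, Lname TEXT, Ssn TEXT PRIMARY KEY, Dno INTEGER,
        Salary REAL CHECK (Salary <= 40000)  -- assumed cap, to provoke an error
    )
""")
conn.execute("INSERT INTO EMPLOYEE VALUES ('John', 'Smith', '123456789', 2, 30000)")
conn.commit()

def hire_and_raise(conn, fname, lname, ssn):
    """Insert a new employee in department 2, then raise salaries there by 10%.

    Either both statements take effect (COMMIT) or neither does (ROLLBACK).
    """
    try:
        conn.execute(
            "INSERT INTO EMPLOYEE (Fname, Lname, Ssn, Dno, Salary) "
            "VALUES (?, ?, ?, 2, 35000)",
            (fname, lname, ssn),
        )
        conn.execute("UPDATE EMPLOYEE SET Salary = Salary * 1.1 WHERE Dno = 2")
        conn.commit()
        return True
    except sqlite3.Error:
        conn.rollback()   # plays the role of the UNDO label
        return False

first = hire_and_raise(conn, "Robert", "Smith", "991004321")
# Second run: Robert's salary would exceed the cap, so the UPDATE fails
# and the row inserted earlier in the same transaction must disappear.
second = hire_and_raise(conn, "Alice", "Jones", "991004322")

row_count = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]
robert_salary = conn.execute(
    "SELECT Salary FROM EMPLOYEE WHERE Ssn = '991004321'"
).fetchone()[0]
print(first, second, row_count, robert_salary)
```

After the failed second run the table still holds two rows and Robert's salary is the value committed by the first run, which matches the rollback behaviour described in point 2.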
