Module 4 DBMS
Chapter 1
SQL: Advanced Queries
4.1 MORE COMPLEX SQL QUERIES
1. Comparisons Involving NULL and Three-Valued Logic:
NULL is used to represent a missing value, which usually has one of three different
interpretations:
value unknown (the value exists but is not known, or it is not known whether a value exists at all),
value not available (the value exists but is purposely withheld), or
attribute not applicable (the attribute is undefined for this tuple).
Examples:
1. Unknown value: A particular person has a date of birth but it is not known, so it is represented
by NULL in the database.
2. Unavailable or withheld value: A person has a home phone but does not want it to be listed,
so it is withheld and represented as NULL in the database.
3. Not applicable attribute: An attribute LastCollegeDegree would be NULL for a person who
has no college degrees, because it does not apply to that person.
SQL does not distinguish between the different meanings of NULL.
In general, each NULL value is considered to be different from every other NULL value in
the database records.
When a record with NULL is involved in a comparison operation, the result is considered to
be UNKNOWN (it may be TRUE or it may be FALSE). Hence, SQL uses a 3-valued logic
with values TRUE, FALSE and UNKNOWN.
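The truth values of three-valued logical expressions built with the logical connectives
AND, OR and NOT are shown in the tables below.
(a) AND      TRUE     FALSE    UNKNOWN
    TRUE     TRUE     FALSE    UNKNOWN
    FALSE    FALSE    FALSE    FALSE
    UNKNOWN  UNKNOWN  FALSE    UNKNOWN
(b) OR       TRUE     FALSE    UNKNOWN
    TRUE     TRUE     TRUE     TRUE
    FALSE    TRUE     FALSE    UNKNOWN
    UNKNOWN  TRUE     UNKNOWN  UNKNOWN
(c) NOT
    TRUE     FALSE
    FALSE    TRUE
    UNKNOWN  UNKNOWN
In (a) and (b), the rows and columns represent the values of the results of two three-valued
Boolean expressions, which would appear in the WHERE clause of an SQL query. Each
expression result would have a value of TRUE, FALSE, or UNKNOWN. Table (c) shows the
result of applying NOT to a single expression.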
In select-project-join queries, the general rule is that only those combinations of tuples that
evaluate the logical expression of the query to TRUE are selected. Tuple combinations that
evaluate to FALSE or UNKNOWN are not selected.
There are exceptions to that rule for certain operations such as outer joins.
SQL allows queries that check whether an attribute value is NULL.
SQL uses the comparison operators IS or IS NOT to compare an attribute value to NULL.
SQL considers each NULL value as being distinct from every other NULL value, so
equality comparison is not appropriate. It follows that when a join condition is specified,
tuples with NULL values for the join attributes are not included in the result (unless it is an
outer join).
Query 1: Retrieve the names of all employees who do not have supervisors.
SELECT Fname, Lname FROM EMPLOYEE
WHERE Super_ssn IS NULL;
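The behaviour of NULL comparisons can be tried out with a small sketch. It uses Python's built-in sqlite3 module only as a convenient way to run SQL; the three EMPLOYEE rows are made-up sample data, and other DBMSs may display UNKNOWN differently (sqlite3 returns it as None).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT, Super_ssn TEXT);
    INSERT INTO EMPLOYEE VALUES
        ('James',    'Borg',  '888665555', NULL),
        ('John',     'Smith', '123456789', '333445555'),
        ('Franklin', 'Wong',  '333445555', '888665555');
""")

# A comparison involving NULL evaluates to UNKNOWN (None in Python).
null_eq_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# UNKNOWN AND FALSE is FALSE; UNKNOWN OR TRUE is TRUE (returned as 0 and 1).
unknown_and_false = conn.execute("SELECT (NULL = 1) AND (1 = 0)").fetchone()[0]
unknown_or_true = conn.execute("SELECT (NULL = 1) OR (1 = 1)").fetchone()[0]

# '= NULL' never evaluates to TRUE, so no tuple is selected.
with_equals = conn.execute(
    "SELECT Fname, Lname FROM EMPLOYEE WHERE Super_ssn = NULL").fetchall()

# IS NULL is the test that Query 1 needs.
with_is_null = conn.execute(
    "SELECT Fname, Lname FROM EMPLOYEE WHERE Super_ssn IS NULL").fetchall()

print(null_eq_null, unknown_and_false, unknown_or_true)
print(with_equals, with_is_null)
```

The empty result for the `= NULL` version illustrates the rule that tuples evaluating to UNKNOWN are not selected.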
2. Nested Queries, Tuples, and Set / Multiset Comparisons:
Some queries require that existing values in the database be fetched and then used in a
comparison condition.
Nested queries are complete select-from-where blocks within another SQL query. That other
query is called the outer query. Nested queries can appear in the WHERE clause, the
FROM clause or other SQL clauses as needed.
The comparison operator IN compares a value v with a set (or multiset) of values V and
evaluates to TRUE if v is one of the elements in V.
Query 2: Retrieve the project numbers of all projects that either
belong to a department managed by an employee whose last name is 'Smith',
OR
have at least one employee with the last name 'Smith' working on them. Display only
distinct project numbers.
If a nested query returns a single attribute and a single tuple, the query result will be a single
value. In such cases, = can be used instead of IN for the comparison operator. In general, a
nested query will return a table (relation) which is a set or multiset of tuples.
SQL allows the use of tuples of values in comparisons by placing them in parentheses. This
is illustrated in the query below.
Query 3: Retrieve the Ssns of all employees who work on the same (project, hours)
combination on some project that employee 'John Smith' (whose Ssn is '123456789') works
on.
SELECT DISTINCT Essn
FROM WORKS_ON
WHERE (Pno, Hours) IN (SELECT Pno, Hours
FROM WORKS_ON
WHERE Essn = '123456789');
In this example, the IN operator compares the sub-tuple of values in parentheses (Pno, Hours)
for each tuple in WORKS_ON with the set of union compatible tuples produced by the nested
query.
A number of comparison operators can be used to compare a single value v (typically an
attribute name) to a set or multiset V (typically a nested query). The = ANY (or = SOME)
operator returns TRUE if the value v is equal to some value in the set V and is hence equivalent
to IN. The keywords ANY and SOME have the same meaning. Other operators that can be combined
with ANY (or SOME) include >, >=, <, <= and <>. The keyword ALL can also be combined with each of these
operators. The comparison condition (v > ALL V) returns TRUE if the value v is greater than all
the values in the set (or multiset) V.
Query 4: Retrieve the names of employees whose salary is greater than the salary of all
the employees in department 5.
SELECT Lname, Fname FROM EMPLOYEE
WHERE Salary > ALL (SELECT Salary
FROM EMPLOYEE WHERE Dno = 5);
If attributes of the same name exist, one in the FROM clause of the nested query and one in the
FROM clause of the outer query, then there arises ambiguity in attribute names. The rule is that
the reference to an unqualified attribute refers to the relation declared in the innermost nested
query. It is generally advisable to create tuple variables (aliases) for all the tables referenced in
an SQL query to avoid errors and ambiguities.
Query 5: Retrieve the name of each employee who has a dependent with the same first
name and sex as the employee.
SELECT E.Lname, E.Fname
FROM EMPLOYEE AS E
WHERE E.Ssn IN (SELECT Essn
FROM DEPENDENT
WHERE E.Fname = Dependent_name AND E.Sex = Sex);
3. Correlated Nested Queries:
Whenever a condition in the WHERE clause of a nested query references some attribute of a
relation declared in the outer query, the two queries are said to be correlated.
For a correlated nested query, the nested query is evaluated once for each tuple (or
combination of tuples) in the outer query.
Example: In Query 5, for each EMPLOYEE tuple, evaluate the nested query, which
retrieves the Essn values for all DEPENDENT tuples with the same sex and name as that
EMPLOYEE tuple; if the SSN value of the EMPLOYEE tuple is in the result of the nested
query, then select that EMPLOYEE tuple.
In general, a query written with nested select-from-where blocks and using the = or IN
comparison operators can always be expressed as a single block query. The query below is
another way of solving Query 5.
SELECT E.Fname, E.Lname
FROM EMPLOYEE AS E, DEPENDENT AS D
WHERE E.Ssn = D.Essn AND E.Sex = D.Sex
AND E.Fname = D.Dependent_name;
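As a rough check that the nested and the single-block formulations of Query 5 select the same employees, both can be run against a tiny in-memory database. This is only a sketch using Python's sqlite3 module; the table contents are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT, Sex TEXT);
    CREATE TABLE DEPENDENT (Essn TEXT, Dependent_name TEXT, Sex TEXT);
    INSERT INTO EMPLOYEE VALUES
        ('John',   'Smith',  '123456789', 'M'),
        ('Alicia', 'Zelaya', '999887777', 'F');
    INSERT INTO DEPENDENT VALUES
        ('123456789', 'John',  'M'),
        ('123456789', 'Alice', 'F'),
        ('999887777', 'John',  'M');
""")

# Correlated nested query: the inner block refers to E from the outer block.
nested = conn.execute("""
    SELECT E.Fname, E.Lname
    FROM EMPLOYEE AS E
    WHERE E.Ssn IN (SELECT D.Essn
                    FROM DEPENDENT AS D
                    WHERE E.Fname = D.Dependent_name AND E.Sex = D.Sex)
""").fetchall()

# Equivalent single-block query using a join.
single_block = conn.execute("""
    SELECT E.Fname, E.Lname
    FROM EMPLOYEE AS E, DEPENDENT AS D
    WHERE E.Ssn = D.Essn AND E.Sex = D.Sex
      AND E.Fname = D.Dependent_name
""").fetchall()

print(nested, single_block)
```

With this data only John Smith has a dependent sharing his first name and sex. Note that the join form can return an employee more than once if several dependents match, so DISTINCT may be needed in general.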
4. The EXISTS and UNIQUE Functions in SQL:
EXISTS and UNIQUE are Boolean functions that return TRUE or FALSE.
They can be used in a WHERE clause condition.
The EXISTS function in SQL is used to check whether the result of a correlated nested query
is empty (contains no tuples) or not.
The result of EXISTS is the Boolean value TRUE if the nested query result contains at least one
tuple, or FALSE if the nested query result contains no tuples.
EXISTS and NOT EXISTS are typically used in conjunction with a correlated nested query.
EXISTS (Q) returns TRUE if there is at least one tuple in the result of the nested query Q
and returns FALSE otherwise.
NOT EXISTS (Q) returns TRUE if there are no tuples in the result of the nested query Q
and returns FALSE otherwise.
Query 5 can be written in an alternative form that uses EXISTS. The nested query
references the Ssn, Fname, and Sex attributes of the EMPLOYEE relation from the outer
query. For each employee tuple, evaluate the nested query, which retrieves all
DEPENDENT tuples with the same Essn, Sex and Dependent_name as the employee
tuple; if at least one tuple exists in the result of the nested query, then select that employee
tuple.
SELECT E.Fname, E.Lname FROM EMPLOYEE AS E WHERE EXISTS (SELECT
* FROM DEPENDENT AS D WHERE E.Ssn = D.Essn AND E.Sex = D.Sex AND
E.Fname = D.Dependent_name);
Query 6: Retrieve the names of employees who have no dependents.
SELECT FNAME, LNAME FROM EMPLOYEE WHERE NOT EXISTS (SELECT *
FROM DEPENDENT WHERE SSN=ESSN);
In this query, the correlated nested query retrieves all DEPENDENT tuples related to an
EMPLOYEE tuple. If none exist, the EMPLOYEE tuple is selected because the WHERE
clause condition will evaluate to TRUE in this case. For each employee tuple, the correlated
nested query selects all DEPENDENT tuples whose Essn value matches the EMPLOYEE Ssn;
if the result is empty, no dependents are related to the employee and so we select that employee
tuple.
Query 7: List the names of managers who have at least one dependent.
SELECT FNAME, LNAME FROM EMPLOYEE WHERE EXISTS (SELECT * FROM
DEPENDENT WHERE SSN=ESSN) AND EXISTS (SELECT * FROM DEPARTMENT
WHERE Ssn = Mgr_ssn);
Query 8: Retrieve the name of each employee who works on all the projects controlled by
department number 5.
SELECT Fname, Lname FROM EMPLOYEE
WHERE NOT EXISTS ((SELECT Pnumber FROM PROJECT WHERE Dnum = 5)
EXCEPT
(SELECT Pno FROM WORKS_ON WHERE Ssn = Essn));
The first subquery (which is not correlated with the outer query) selects all projects controlled
by department 5, and the second subquery (which is correlated with the outer query) selects all
projects that the particular employee being considered works on. If the set difference of the
first subquery result minus (EXCEPT) the second subquery result is empty, it means that the
employee works on all the projects and is therefore selected.
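The set-difference idea can be sketched on a small made-up data set using Python's sqlite3 module. Two details are assumptions of this sketch, not of the notes: SQLite does not accept parentheses around the operands of EXCEPT, so they are omitted here, and the correlated reference is written as EMPLOYEE.Ssn to make it explicit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT);
    CREATE TABLE PROJECT (Pnumber INTEGER, Dnum INTEGER);
    CREATE TABLE WORKS_ON (Essn TEXT, Pno INTEGER);
    INSERT INTO EMPLOYEE VALUES ('John', 'Smith', '1'), ('Joyce', 'English', '2');
    -- Projects 1 and 2 are controlled by department 5; project 10 is not.
    INSERT INTO PROJECT VALUES (1, 5), (2, 5), (10, 4);
    -- Smith works on both department-5 projects; English only on one of them.
    INSERT INTO WORKS_ON VALUES ('1', 1), ('1', 2), ('2', 1), ('2', 10);
""")

# An employee is selected when (department-5 projects) minus
# (projects the employee works on) is empty.
rows = conn.execute("""
    SELECT Fname, Lname
    FROM EMPLOYEE
    WHERE NOT EXISTS (SELECT Pnumber FROM PROJECT WHERE Dnum = 5
                      EXCEPT
                      SELECT Pno FROM WORKS_ON WHERE Essn = EMPLOYEE.Ssn)
""").fetchall()

print(rows)
```

For English the difference is {2}, which is not empty, so only Smith is returned.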
The function UNIQUE (Q) returns TRUE if there are no duplicate tuples in the result of query
Q; otherwise it returns FALSE. The UNIQUE function can be used to test whether the result
of a nested query is a set (no duplicates) or a multiset (duplicates exist).
5. Explicit sets and Renaming of Attributes in SQL:
An explicit set of values can be used in the where clause rather than a nested query. Such a
set is enclosed in parentheses in SQL.
Query 9: Retrieve the Essn of all employees who work on project numbers 1, 2 or 3.
SELECT DISTINCT Essn
FROM WORKS_ON WHERE Pno IN (1, 2, 3);
In SQL, it is possible to rename any attribute that appears in the result of a query by adding
the AS qualifier followed by the desired new name.
The AS construct can be used to alias both attribute and relation names in general, and it can be
used in appropriate parts of a query.
SELECT E.Lname AS Employee_name, S.Lname AS Supervisor_name
FROM EMPLOYEE AS E, EMPLOYEE AS S
WHERE E.Super_ssn = S.Ssn;
6. Joined Tables in SQL:
The joined table (or joined relation) permits users to specify a table resulting from a join
operation in the FROM clause of a query.
This construct avoids mixing together all the select and join conditions in the WHERE clause.
Example: Consider query which retrieves the name and address of every employee who works
for the 'Research' department. First specify the join of the EMPLOYEE and DEPARTMENT
relations and then select the desired tuples and attributes. The FROM clause contains a single
joined table.
SELECT Fname, Lname, Address FROM (EMPLOYEE JOIN DEPARTMENT ON
Dno = Dnumber) WHERE Dname = 'Research';
The attributes of such a table are all the attributes of the first table EMPLOYEE followed by
all attributes of the second table DEPARTMENT. In a NATURAL JOIN on two relations R
and S, no join condition is specified; an implicit EQUIJOIN condition for each pair of attributes
with the same name from R and S is created. Each such pair of attributes is included only once
in the resulting relation.
If the names of join attributes are not the same in base relations, rename the attributes so that
they match and then apply the NATURAL JOIN. The AS construct can be used to rename a
relation and all its attributes in the FROM clause.
Example: Here the DEPARTMENT relation is renamed as DEPT and its attributes are renamed
as Dname, Dno, Mssn and Msdate. The implied join condition for this natural join is
EMPLOYEE.Dno = DEPT.Dno because it is the only pair of attributes with the same name.
Example 1:
SELECT Orders.OrderID, Employees.LastName, Employees.FirstName
FROM Orders
RIGHT JOIN Employees ON Orders.EmployeeID = Employees.EmployeeID
ORDER BY Orders.OrderID;
Join specifications can be nested where one of the tables in a join may itself be a joined table.
This allows the specification of the join of three or more tables as a single joined table which
is called a multiway join.
Example: SELECT PNUMBER, DNUM, LNAME, BDATE, ADDRESS
FROM((PROJECT JOIN DEPARTMENT ON DNUM=DNUMBER) JOIN
EMPLOYEE ON MGRSSN=SSN) WHERE PLOCATION='Stafford';
Some SQL implementations have a different syntax to specify outer joins by using the
comparison operators += for left outer join, =+ for right outer join and +=+ for full outer join
when specifying the join condition. (Oracle's legacy syntax is similar in spirit, but it marks the
optional side with a (+) suffix on a column in the join condition.)
Example: SELECT E.Lname, S.Lname
FROM EMPLOYEE E, EMPLOYEE S
WHERE E.Super_ssn += S.Ssn;
7. Aggregate Functions in SQL:
Aggregate functions are used to summarize information from multiple tuples into a single-
tuple summary.
Grouping is used to create subgroups of tuples before summarization.
SQL has built-in aggregate functions - COUNT, SUM, MAX, MIN and AVG.
The COUNT function returns the number of tuples or values as specified in a query.
The functions SUM, MAX, MIN and AVG are applied to a set or multiset of numeric values
and return the sum, the maximum value, the minimum value and the average of those values
respectively.
These functions can be used in the SELECT clause or in a HAVING clause.
The functions MAX and MIN can also be used with attributes that have nonnumeric domains
if the domain values have a total ordering among one another.
NULL values are discarded when aggregate functions are applied to a particular column
(attribute). COUNT (*) counts tuples, not values, hence NULL values do not affect it.
When an aggregate function is applied to a collection of values, NULLs are removed from
the collection before the calculation. If the collection becomes empty because all values are
NULL, the aggregate function will return NULL, except COUNT, which returns 0 for an empty
collection of values.
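The NULL-handling rules for aggregates can be observed directly. The following is a minimal sketch using Python's sqlite3 module, with three invented EMPLOYEE rows of which one has a NULL salary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Ssn TEXT, Salary INTEGER);
    INSERT INTO EMPLOYEE VALUES ('1', 30000), ('2', 40000), ('3', NULL);
""")

# COUNT(*) counts tuples; COUNT(Salary), SUM and AVG ignore the NULL salary.
count_rows, count_vals, total, average = conn.execute(
    "SELECT COUNT(*), COUNT(Salary), SUM(Salary), AVG(Salary) FROM EMPLOYEE"
).fetchone()

# When every value in the collection is NULL, COUNT gives 0 and the
# other aggregates give NULL (None in Python).
all_null = conn.execute(
    "SELECT COUNT(Salary), SUM(Salary), MAX(Salary) "
    "FROM EMPLOYEE WHERE Ssn = '3'"
).fetchone()

print(count_rows, count_vals, total, average)
print(all_null)
```

The average is computed over the two non-NULL salaries only, which is why it differs from the sum divided by COUNT(*).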
Aggregate functions can also be used in selection conditions involving nested queries. A
correlated nested query with an aggregate function can be specified and then used in the
WHERE clause of an outer query.
SQL also has aggregate functions SOME and ALL that can be applied to a collection of
Boolean values. SOME returns TRUE if at least one element in the collection is TRUE, whereas
ALL returns TRUE if all elements in the collection are TRUE.
Query 10: Find the sum of salaries, maximum salary, the minimum salary, and the
average salary among all employees.
SELECT SUM(SALARY), MAX(SALARY), MIN(SALARY), AVG(SALARY)
FROM EMPLOYEE;
This query returns a single-row summary of all the rows in the EMPLOYEE table. We can
use the keyword AS to rename the column names in the resulting single-row table.
SELECT SUM(SALARY) AS Total_salary, MAX(SALARY) AS Highest_salary,
MIN(SALARY) AS Lowest_salary, AVG(SALARY) AS Average_salary FROM EMPLOYEE;
Query 11: Find the sum of the salaries, maximum salary, the minimum salary, and the
average salary among employees who work for the 'Research' department.
SELECT SUM(SALARY), MAX(SALARY), MIN(SALARY), AVG(SALARY) FROM
EMPLOYEE, DEPARTMENT WHERE DNO=DNUMBER AND DNAME='Research';
Query 12: Retrieve the total number of employees in the company.
SELECT COUNT(*) FROM EMPLOYEE;
Query 13: Retrieve the total number of employees in the 'Research' department.
SELECT COUNT(*) FROM EMPLOYEE, DEPARTMENT WHERE DNO=DNUMBER
AND DNAME='Research';
Here the asterisk (*) refers to the rows (tuples), so COUNT (*) returns the number of rows in
the result of the query. The COUNT function can also be used to count values in a column
rather than tuples.
Query 14: Count the number of distinct salary values in the database.
SELECT COUNT (DISTINCT Salary) FROM EMPLOYEE;
COUNT (Salary) will not eliminate duplicate values of Salary. Any tuples with NULL for
Salary will not be counted.
Query 15: Retrieve the names of all employees who have two or more dependents.
SELECT Lname, Fname FROM EMPLOYEE WHERE (SELECT COUNT (*) FROM
DEPENDENT WHERE Ssn = Essn) >= 2;
The correlated nested query counts the number of dependents that each employee has. If the
count is greater than or equal to two, the employee tuple is selected.
Query 18: For each project on which more than two employees work, retrieve the project
number, project name, and the number of employees who work on that project.
SELECT Pnumber, Pname, COUNT (*) FROM PROJECT, WORKS_ON WHERE
Pnumber = Pno GROUP BY Pnumber, Pname HAVING COUNT (*) > 2;
Query 19: For each project, retrieve the project number, the project name and the
number of employees from department 5 who work on the project.
SELECT Pnumber, Pname, COUNT (*) FROM PROJECT, WORKS_ON, EMPLOYEE
WHERE Pnumber = Pno AND Ssn = Essn AND Dno = 5 GROUP BY Pnumber,
Pname;
Query 20: For each department that has more than 5 employees, retrieve the department
number and the number of employees who are making a salary more than $40,000.
SELECT Dno, COUNT(*)
FROM EMPLOYEE WHERE Salary > 40000 AND Dno IN (SELECT Dno FROM
EMPLOYEE GROUP BY Dno HAVING COUNT(*) > 5) GROUP BY Dno;
9. SQL Constructs: WITH and CASE
The WITH clause allows a user to define a table that will only be used in a particular query.
This table will be dropped after its use in that query.
Queries using WITH can generally be written using other SQL constructs.
Example:
WITH LARGE_DEPTS (Dno) AS
(SELECT Dno FROM EMPLOYEE GROUP BY Dno HAVING COUNT(*) > 5)
SELECT Dno, COUNT(*) FROM EMPLOYEE
WHERE Salary > 40000 AND Dno IN (SELECT Dno FROM LARGE_DEPTS)
GROUP BY Dno;
Here a temporary table LARGE_DEPTS is defined using the WITH clause whose result holds
the Dnos of departments with more than 5 employees. This table is then used in the subsequent
query. Once this query is executed the temporary table LARGE_DEPTS is discarded.
The SQL CASE construct can be used when a value can be different based on certain
conditions.
It can be used in any part of an SQL query where a value is expected, including when
querying, inserting or updating tuples.
Example: Suppose we want to give employees different raise amounts depending on which
department they work for. Employees in department 5 get a $2000 raise, those in department 4
get $1500 and those in department 1 get $3000. We can write the update operation as:
UPDATE EMPLOYEE
SET Salary = CASE WHEN Dno = 5 THEN Salary + 2000
WHEN Dno = 4 THEN Salary + 1500
WHEN Dno = 1 THEN Salary + 3000
ELSE Salary + 0
END;
Here the salary raise value is determined through the CASE construct based on the department
number for which each employee works.
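The CASE-based raise can be tried on a few sample rows. This is a small sketch with Python's sqlite3 module; the four employees and their salaries are invented, and department 7 is included only to exercise the ELSE branch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Ssn TEXT, Dno INTEGER, Salary INTEGER);
    INSERT INTO EMPLOYEE VALUES ('1', 5, 30000), ('2', 4, 25000),
                                ('3', 1, 55000), ('4', 7, 20000);
""")

# One UPDATE statement; the raise depends on the department of each row.
conn.execute("""
    UPDATE EMPLOYEE
    SET Salary = CASE WHEN Dno = 5 THEN Salary + 2000
                      WHEN Dno = 4 THEN Salary + 1500
                      WHEN Dno = 1 THEN Salary + 3000
                      ELSE Salary + 0
                 END
""")

salaries = dict(conn.execute("SELECT Ssn, Salary FROM EMPLOYEE").fetchall())
print(salaries)
```

Note that the CASE expression must be closed with END; without the ELSE branch, rows in other departments would have their salary set to NULL.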
The CASE construct can also be used when inserting tuples that can have different attributes
being NULL depending on the type of record being inserted into a table, as when a
specialization is mapped into a single table or when a union type is mapped into a relation.
10. Recursive Queries in SQL
A recursive query can be written in SQL using the WITH RECURSIVE construct. It allows users
to specify a recursive query in a declarative manner.
An example of a recursive relationship between tuples of the same type is the relationship between
an employee and a supervisor. This relationship is described by the foreign key Super_ssn of
the EMPLOYEE relation.
An example of a recursive operation is to retrieve all supervisees of a supervisor employee
e at all levels: all employees e' directly supervised by e, all employees e'' directly
supervised by each employee e', all employees e''' directly supervised by each employee
e'', and so on.
WITH RECURSIVE SUP_EMP (Supssn, Empssn) AS
(SELECT Super_ssn, Ssn FROM EMPLOYEE
UNION
SELECT S.Supssn, E.Ssn FROM EMPLOYEE AS E, SUP_EMP AS S
WHERE E.Super_ssn = S.Empssn)
SELECT * FROM SUP_EMP;
Here the view SUP_EMP will hold the result of the recursive query. The view is initially empty.
It is first loaded with the first-level (supervisor, supervisee) Ssn combinations through the first
part (SELECT Super_ssn, Ssn FROM EMPLOYEE), which is called the base query. This will
be combined via UNION with each successive level of supervisees through the second part,
where the view contents are joined again with the base values to get the second-level
combinations, which are UNIONed with the first level. This is repeated with successive levels
until a fixed point is reached, where no more tuples are added to the view. At this point the result
of the recursive query is in the view SUP_EMP.
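The fixed-point computation can be followed on a three-level chain of employees. The sketch below uses Python's sqlite3 module; the employee identifiers 'A', 'B' and 'C' are invented, the recursive part lists the supervisor column first so that it matches the declared (Supssn, Empssn) order, and the IS NOT NULL filter in the base query is an addition made here to keep the top-level employee (who has no supervisor) out of the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Ssn TEXT, Super_ssn TEXT);
    -- A supervises B, B supervises C; A has no supervisor.
    INSERT INTO EMPLOYEE VALUES ('A', NULL), ('B', 'A'), ('C', 'B');
""")

pairs = conn.execute("""
    WITH RECURSIVE SUP_EMP (Supssn, Empssn) AS (
        -- Base query: direct (supervisor, supervisee) pairs.
        SELECT Super_ssn, Ssn FROM EMPLOYEE WHERE Super_ssn IS NOT NULL
        UNION
        -- Recursive part: E is supervised by someone already known
        -- to be a supervisee of S.Supssn.
        SELECT S.Supssn, E.Ssn
        FROM EMPLOYEE AS E, SUP_EMP AS S
        WHERE E.Super_ssn = S.Empssn
    )
    SELECT Supssn, Empssn FROM SUP_EMP ORDER BY Supssn, Empssn
""").fetchall()

print(pairs)
```

The pair ('A', 'C') does not exist in the base table; it is produced by the recursive step, after which no new tuples appear and the evaluation stops.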
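Select SQL Statement: A query in SQL can consist of up to six clauses, but only the first two,
SELECT and FROM, are mandatory. The clauses are specified in the following order:
SELECT <attribute and function list>
FROM <table list>
[WHERE <condition>]
[GROUP BY <grouping attribute(s)>]
[HAVING <group condition>]
[ORDER BY <attribute list>];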
The WHERE clause specifies the conditions for selection of tuples from these relations
including join conditions if needed.
GROUP BY specifies grouping attributes whereas HAVING specifies a condition on the
groups being selected rather than on the individual tuples.
The built in aggregate functions COUNT, SUM, AVG, MIN and MAX are used in conjunction
with grouping but they can also be applied to all the selected tuples in a query without the group
by clause.
ORDER BY specifies an order for displaying the result of a query. It is applied at the end
to sort the query result.
A query is evaluated conceptually by first applying the FROM clause followed by the
WHERE clause and then by the GROUP BY and HAVING.
1. For each department, retrieve the department number, the number of employees in the
department, and their average salary.
SELECT Dno, COUNT (*), AVG (Salary) FROM EMPLOYEE GROUP BY Dno;
2. For each project, retrieve the project number, the project name, and the number of
employees who work on that project.
SELECT Pnumber, Pname, COUNT (*) FROM PROJECT, WORKS_ON WHERE Pnumber
= Pno GROUP BY Pnumber, Pname;
The HAVING clause was added to SQL because the WHERE clause cannot be used with
aggregate functions; HAVING filters groups on aggregate values.
1. For each project on which more than two employees work, retrieve the project number,
the project name, and the number of employees who work on the project.
SELECT Pnumber, Pname, COUNT (*) FROM PROJECT, WORKS_ON WHERE Pnumber
= Pno GROUP BY Pnumber, Pname HAVING COUNT (*) > 2;
Other actions may be specified, such as executing a specific stored procedure or triggering
other updates.
Example: Suppose we want to check whenever an employee's salary is greater than the salary
of his or her direct supervisor in the COMPANY database. Several events can trigger this rule:
inserting a new employee record, changing an employee's salary, or changing an employee's
supervisor. Suppose that the action to take would be to call an external stored procedure
INFORM_SUPERVISOR, which will notify the supervisor. The trigger could then be written
as below.
CREATE TRIGGER SALARY_VIOLATION
BEFORE INSERT OR UPDATE OF SALARY, SUPERVISOR_SSN ON EMPLOYEE
FOR EACH ROW
WHEN (NEW.SALARY > (SELECT SALARY FROM EMPLOYEE
WHERE SSN = NEW.SUPERVISOR_SSN))
INFORM_SUPERVISOR(NEW.Supervisor_ssn, NEW.Ssn);
The trigger is given the name SALARY_VIOLATION, which can be used to remove or
deactivate the trigger later.
A typical trigger, which is regarded as an ECA (Event, Condition, Action) rule, has three
components:
1. The event(s): These are usually database update operations that are explicitly applied to the
database. The person who writes the trigger must make sure that all possible events are
accounted for. In some cases, it may be necessary to write more than one trigger to cover all
possible cases. These events are specified after the keyword BEFORE, which means that the
trigger should be executed before the triggering operation is executed. An alternative is to use
the keyword AFTER, which specifies that the trigger should be executed after the operation
specified in the event is completed.
2. The condition that determines whether the rule action should be executed: Once the
triggering event has occurred, an optional condition may be evaluated. If no condition is
specified, the action will be executed once the event occurs. If a condition is specified, it is
first evaluated, and only if it evaluates to true will the rule action be executed. The condition
is specified in the WHEN clause of the trigger.
3. The action to be taken: The action is usually a sequence of SQL statements, but it could
also be a database transaction or an external program that will be automatically executed.
In this example, the action is to execute the stored procedure INFORM_SUPERVISOR.
Triggers can be used in various applications, such as maintaining database consistency,
monitoring database updates, and updating derived data automatically.
A trigger specifies an event, a condition and an action. The action is to be executed
automatically if the condition is satisfied when the event occurs.
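Trigger syntax differs considerably between systems, so the following is only a sketch of the event, condition and action parts in SQLite's dialect, run through Python's sqlite3 module. Several things are assumptions of this sketch: SQLite needs one trigger per event (only UPDATE OF Salary is covered here), the action must be SQL statements inside BEGIN ... END, and the hypothetical table SALARY_ALERT stands in for the INFORM_SUPERVISOR procedure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Salary INTEGER,
                           Supervisor_ssn TEXT);
    CREATE TABLE SALARY_ALERT (Supervisor_ssn TEXT, Employee_ssn TEXT);

    CREATE TRIGGER SALARY_VIOLATION
    AFTER UPDATE OF Salary ON EMPLOYEE
    FOR EACH ROW
    WHEN NEW.Salary > (SELECT Salary FROM EMPLOYEE
                       WHERE Ssn = NEW.Supervisor_ssn)
    BEGIN
        INSERT INTO SALARY_ALERT VALUES (NEW.Supervisor_ssn, NEW.Ssn);
    END;

    INSERT INTO EMPLOYEE VALUES ('100', 50000, NULL), ('200', 30000, '100');
""")

# Event occurs, but the condition is false (40000 is not above 50000).
conn.execute("UPDATE EMPLOYEE SET Salary = 40000 WHERE Ssn = '200'")
alerts_after_small_raise = conn.execute("SELECT * FROM SALARY_ALERT").fetchall()

# Event occurs and the condition is true, so the action runs.
conn.execute("UPDATE EMPLOYEE SET Salary = 60000 WHERE Ssn = '200'")
alerts_after_big_raise = conn.execute("SELECT * FROM SALARY_ALERT").fetchall()

print(alerts_after_small_raise, alerts_after_big_raise)
```

For an employee without a supervisor the subquery yields NULL, the condition evaluates to UNKNOWN, and the action is not executed.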
The view DEPT_INFO explicitly specifies new attribute names using a one-to-one
correspondence between the attributes specified in the CREATE VIEW clause and those
specified in the SELECT clause of the query that defines the view.
Example: To retrieve the last name and first name of all employees who work on the 'ProductX'
project.
QV: SELECT Fname, Lname FROM WORKS_ON1 WHERE Pname = 'ProductX';
Advantages of view:
It simplifies the specification of certain queries.
It is also used as a security and authorization mechanism.
A view should always be up to date, i.e., if we modify the tuples in the base tables on which
the view is defined, the view must automatically reflect these changes. Hence a view is not
realized at the time of view definition but when we specify a query on the view.
It is the responsibility of the DBMS, not of the user, to ensure that a view is up to date.
If a view is not needed, it can be removed by the DROP VIEW
command. Eg: DROP VIEW WORKS_ON1;
3. View Implementation and View Update:
Two main approaches have been suggested for how a DBMS can efficiently implement a
view for querying.
The strategy of query modification involves modifying or transforming the view query into
a query on the underlying base tables.
Example: The query QV would automatically be modified to the following query by the
DBMS.
SELECT FNAME, LNAME FROM EMPLOYEE, PROJECT, WORKS_ON WHERE
SSN=ESSN AND PNO=PNUMBER AND PNAME='ProductX';
The disadvantage of this approach is that it is inefficient for views defined via complex
queries that are time consuming to execute, especially if multiple queries are applied to the
view within a short time.
The other strategy, view materialization involves physically creating a temporary view table
when the view is first queried and keeping that table on the assumption that other queries on
the view will follow.
Here, an efficient strategy to automatically update the view when the base tables are updated
must be developed to keep the view up to date. Incremental update techniques have been developed to
determine what new tuples must be inserted, deleted or modified in a materialized view table
when a change is applied to one of the defining base tables. The view is generally kept as a
materialized (physically stored) table as long as it is being queried. If the view is not queried
for a certain period of time, the system may then automatically remove the physical table and
recompute it from scratch when future queries reference the view.
Different strategies as to when a materialized view is updated are possible.
The immediate update strategy updates a view as soon as the base tables are changed.
The lazy update strategy updates the view when needed by a view query.
The periodic update strategy updates the view periodically (in the latter strategy, a view query
may get a result that is not up to date).
A retrieval query against any view can always be issued. But issuing an INSERT, DELETE,
or UPDATE command on a view table is in many cases not possible.
In general, an update on a view defined on a single table without any aggregate functions can
be mapped to an update on the underlying base table under certain conditions.
For a view involving joins, an update operation may be mapped to update operations on the
underlying base relations in multiple ways. Hence, it is not possible for the DBMS to determine
which of the updates is intended.
Example: Suppose that we issue the command to update the Pname attribute of 'John Smith'
from 'ProductX' to 'ProductY' in the view WORKS_ON1.
This view update is shown in UV1:
UV1: UPDATE WORKS_ON1 SET Pname = 'ProductY'
WHERE Lname = 'Smith' AND Fname = 'John' AND Pname = 'ProductX';
This query can be mapped into several updates on the base relations to give the desired update
effect on the view. Some of these updates will create additional side effects that affect the result
of other queries. Consider two possible updates, (a) and (b), on the base relations corresponding
to UV1. Update (a) relates 'John Smith' to the 'ProductY' PROJECT tuple in place of the
'ProductX' PROJECT tuple and is the most likely desired update.
Update (b) would also give the desired update effect on the view, but it accomplishes this by
changing the name of the 'ProductX' tuple in the PROJECT relation to 'ProductY'. It is quite
unlikely that the user who specified the view update UV1 wants the update to be interpreted as
in (b), since it also has the side effect of changing all the view tuples with Pname 'ProductX'.
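The difference between the two interpretations can be made concrete with a small experiment using Python's sqlite3 module. Everything here is a sketch under stated assumptions: the definition of WORKS_ON1 as a join of EMPLOYEE, PROJECT and WORKS_ON, the sample rows, and the exact form of the base-table updates labelled (a) and (b) are not given in these notes and are written out only for illustration.

```python
import sqlite3

def fresh_db():
    """Build a tiny COMPANY-like database with an assumed WORKS_ON1 view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE EMPLOYEE (Ssn TEXT, Fname TEXT, Lname TEXT);
        CREATE TABLE PROJECT (Pnumber INTEGER, Pname TEXT);
        CREATE TABLE WORKS_ON (Essn TEXT, Pno INTEGER, Hours REAL);
        INSERT INTO EMPLOYEE VALUES ('1', 'John', 'Smith'),
                                    ('2', 'Joyce', 'English');
        INSERT INTO PROJECT VALUES (1, 'ProductX'), (2, 'ProductY');
        INSERT INTO WORKS_ON VALUES ('1', 1, 32.5), ('2', 1, 20.0);
        CREATE VIEW WORKS_ON1 AS
            SELECT Fname, Lname, Pname, Hours
            FROM EMPLOYEE, PROJECT, WORKS_ON
            WHERE Ssn = Essn AND Pno = Pnumber;
    """)
    return conn

def view_rows(conn):
    return conn.execute(
        "SELECT Fname, Lname, Pname FROM WORKS_ON1 ORDER BY Lname").fetchall()

# Update (a): re-point John Smith's WORKS_ON tuple at the 'ProductY' project.
conn_a = fresh_db()
conn_a.execute("""
    UPDATE WORKS_ON
    SET Pno = (SELECT Pnumber FROM PROJECT WHERE Pname = 'ProductY')
    WHERE Essn IN (SELECT Ssn FROM EMPLOYEE
                   WHERE Lname = 'Smith' AND Fname = 'John')
      AND Pno = (SELECT Pnumber FROM PROJECT WHERE Pname = 'ProductX')
""")
after_a = view_rows(conn_a)

# Update (b): rename the project itself; every view tuple for it changes.
conn_b = fresh_db()
conn_b.execute("UPDATE PROJECT SET Pname = 'ProductY' WHERE Pname = 'ProductX'")
after_b = view_rows(conn_b)

print(after_a)
print(after_b)
```

Both updates make John Smith appear with 'ProductY' in the view, but only (b) also changes the view tuple of Joyce English, which is the side effect described above.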
Some view updates may not make much sense. For example, modifying the Total_sal
attribute of the DEPT_INFO view does not make sense because Total_sal is defined to be the
sum of the individual employee salaries. This request is shown as UV2:
A view update is feasible when only one possible update on the base relations can accomplish
the desired update effect on the view.
Whenever an update on the view can be mapped to more than one update on the underlying
base relations, it is usually not permitted.
A view with a single defining table is updatable if the view attributes contain the primary key
of the base relation, as well as all attributes with the NOT NULL constraint that do not have
default values specified.
Views defined on multiple tables using joins are generally not updatable.
Views defined using grouping and aggregate functions are not updatable.
In SQL, the clause WITH CHECK OPTION should be added at the end of the view definition
if a view is to be updated by INSERT, DELETE, or UPDATE statements. This allows the
system to reject operations that violate the SQL rules for view updates.
It is also possible to define a view table in the FROM clause of an SQL query. This is known
as an in-line view.
4. Views as Authorization Mechanisms:
Views can be used to hide certain attributes or tuples from unauthorized users.
Suppose a certain user is only allowed to see employee information for employees who work
for department 5;
then we can create the following view DEPTEMP and grant the user the privilege to query the
view but not the base table EMPLOYEE itself.
This user will only be able to retrieve employee information for employee tuples whose Dno =
5, and will not be able to see other employee tuples when the view is queried.
CREATE VIEW DEPTEMP AS SELECT * FROM EMPLOYEE WHERE Dno = 5;
A view can restrict a user to only see certain columns; for example, only the first name, last
name, and address of an employee may be visible as follows:
CREATE VIEW BASIC_EMP_DATA AS SELECT Fname, Lname, Address FROM
EMPLOYEE;
By creating an appropriate view and granting certain users access to the view and not the base
tables, they would be restricted to retrieving only the data specified in the view.
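A minimal sketch of what the two kinds of restricting view expose is given below, using Python's sqlite3 module. The EMPLOYEE columns and sample rows are hypothetical. SQLite has no GRANT statement, so the privilege step itself is not shown; the sketch only demonstrates which rows and columns are reachable through each view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified EMPLOYEE table and sample rows
    CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT PRIMARY KEY,
                           Address TEXT, Salary INTEGER, Dno INTEGER);
    INSERT INTO EMPLOYEE VALUES
        ('John',     'Smith',  '111', 'Houston',  30000, 5),
        ('Franklin', 'Wong',   '222', 'Houston',  40000, 5),
        ('Alicia',   'Zelaya', '333', 'Spring',   25000, 4);

    -- Row restriction: only employees of department 5
    CREATE VIEW DEPTEMP AS SELECT * FROM EMPLOYEE WHERE Dno = 5;

    -- Column restriction: only name and address
    CREATE VIEW BASIC_EMP_DATA AS SELECT Fname, Lname, Address FROM EMPLOYEE;
""")

# Querying DEPTEMP never returns the department 4 employee.
dept5 = conn.execute(
    "SELECT Fname, Dno FROM DEPTEMP ORDER BY Fname").fetchall()

# Querying BASIC_EMP_DATA never exposes Ssn or Salary.
cur = conn.execute("SELECT * FROM BASIC_EMP_DATA")
visible_columns = [d[0] for d in cur.description]

print(dept5)
print(visible_columns)
```

In a DBMS that supports authorization, the user would be granted SELECT on the views only, so the hidden rows and columns cannot be reached by querying EMPLOYEE directly.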
4.4 SCHEMA CHANGE STATEMENTS IN SQL
1. Alter Schema:
ALTER is generally used to change the definition of schema elements such as tables in SQL.
In SQL Server, ALTER SCHEMA is used to transfer a securable (object) from one schema to
another within the same database.
Syntax: ALTER SCHEMA target_schema_name TRANSFER [entity_type ::] securable_name;
• target_schema_name is the name of the schema into which the object should be
transferred.
• TRANSFER is a keyword that transfers the object from one schema to the other.
• entity_type is the class of the object that is to be transferred (for example, Object or Type).
• securable_name is the name of the object to be moved, qualified by its current schema.
2. Drop Schema:
The DROP command can be used to drop named schema elements, such as tables, domains,
or constraints. One can also drop a schema. For example, if a whole schema is no longer
needed, the DROP SCHEMA command is used.
• There are two drop behavior options: CASCADE and RESTRICT.
• With CASCADE, the schema is removed together with all its tables, domains, and other
elements:
DROP SCHEMA COMPANY CASCADE;
• If the RESTRICT option is chosen in place of CASCADE, the schema is dropped only if it
has no elements in it; otherwise, the DROP command will not be executed.
• To use the RESTRICT option, the user must first individually drop each element in the
schema, then drop the schema itself.
• The DROP TABLE command not only deletes all the records in the table if successful, but
also removes the table definition from the catalog. If it is desired to delete only the records
but to leave the table definition for future use, then the DELETE command should be used
instead of DROP TABLE.
• The DROP command can also be used to drop other types of named schema elements, such
as constraints or domains.
• For example, the following command removes the attribute Address from the
EMPLOYEE base table:
ALTER TABLE COMPANY.EMPLOYEE DROP COLUMN Address CASCADE;
• It is also possible to alter a column definition by dropping an existing default clause or by
defining a new default clause. The following examples illustrate this:
ALTER TABLE COMPANY.DEPARTMENT ALTER COLUMN Mgr_ssn DROP DEFAULT;
ALTER TABLE COMPANY.DEPARTMENT ALTER COLUMN Mgr_ssn SET DEFAULT '333445555';
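The difference between DELETE and DROP TABLE described above can be checked with a small sketch like the following. It uses Python's sqlite3 module; the DEPENDENT table, its rows, and the helper table_exists are hypothetical. In SQLite the catalog is the sqlite_master table, which stands in here for the catalog of any other DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table and rows for illustration only
    CREATE TABLE DEPENDENT (Essn TEXT, Dependent_name TEXT);
    INSERT INTO DEPENDENT VALUES ('123456789', 'Alice'),
                                 ('123456789', 'Michael');
""")

def table_exists(name):
    """Return True if the table definition is still in the catalog."""
    row = conn.execute(
        "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,)).fetchone()
    return row[0] == 1

# DELETE removes the records but keeps the table definition.
conn.execute("DELETE FROM DEPENDENT")
rows_after_delete = conn.execute("SELECT COUNT(*) FROM DEPENDENT").fetchone()[0]
exists_after_delete = table_exists("DEPENDENT")

# DROP TABLE removes the records and the definition from the catalog.
conn.execute("DROP TABLE DEPENDENT")
exists_after_drop = table_exists("DEPENDENT")

print(rows_after_delete, exists_after_delete, exists_after_drop)
```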
CHAPTER 2:
Introduction to Transaction Processing Concepts and Theory
The concept of transaction provides a mechanism for describing logical units of database
processing. Transaction processing systems are systems with large databases and hundreds of
concurrent users executing database transactions. Examples of such systems include airline
reservations, banking, credit card processing, online retail purchasing, stock markets,
supermarket checkouts, and many other applications.
Introduction to Transaction Processing
In this section we discuss the concepts of concurrent execution of transactions and recovery
from transaction failures.
Fig 21.2
• Multiple users can access databases—and use computer systems—simultaneously because of
the concept of multiprogramming, which allows the operating system of the computer to
execute multiple programs—or processes—at the same time.
• A single central processing unit (CPU) can only execute at most one process at a time.
However, multiprogramming operating systems execute some commands from one process,
then suspend that process and execute some commands from the next process, and so on.
A process is resumed at the point where it was suspended whenever it gets its turn to use the
CPU again. Hence, concurrent execution of processes is actually interleaved, as illustrated in
Figure 21.1, which shows two processes, A and B, executing concurrently in an interleaved
fashion. Interleaving keeps the CPU busy when a process requires an input or output (I/O)
operation, such as reading a block from disk. The CPU is switched to execute another process
rather than remaining idle during I/O time.
Interleaving also prevents a long process from delaying other processes. If the computer
system has multiple hardware processors (CPUs), parallel processing of multiple processes is
possible, as illustrated by processes C and D in Figure 21.1.
5.2.2 Transactions, Database items, Read & Write operations, DBMS buffers
A transaction is an executing program that forms a logical unit of database processing. It
includes one or more DB access operations such as insertion, deletion, modification, or
retrieval. Transactions can be embedded within an application program using begin
transaction and end transaction statements, or specified interactively via a high-level query
language like SQL. Read-only transactions do not update the database, while read-write
transactions do. The size of a data item is called its granularity.
Basic DB access operations include read_item(X) to read a DB item into a program variable
and write_item(X) to write a program variable's value into the DB item.
Executing read_item(X) involves finding the disk block address, copying the block to a
buffer, and copying the item from the buffer to a program variable.
Executing write_item(X) involves finding the disk block address, copying the block to a
buffer, copying the item from the program variable to the buffer, and storing the updated
block back to disk. The recovery manager handles when to store modified blocks. A DB cache
includes data buffers, and when these are full, a buffer replacement policy, such as LRU, is
used.
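The steps of read_item and write_item can be sketched in Python as below. This is a simplified model under stated assumptions, not how any particular DBMS implements its buffer manager: the class BufferedDB, the location map from item to block, and the choice to write a modified block back only when it is evicted are all hypothetical. The disk is simulated by a dictionary of blocks.

```python
from collections import OrderedDict

class BufferedDB:
    """Toy model: a 'disk' of blocks plus a small LRU buffer pool."""

    def __init__(self, disk, location, capacity=2):
        self.disk = disk              # block_id -> {item: value}
        self.location = location      # item -> block_id (the "disk address")
        self.capacity = capacity      # number of buffers in the cache
        self.buffers = OrderedDict()  # block_id -> block copy, in LRU order
        self.dirty = set()            # blocks modified but not yet on disk

    def _fetch(self, block_id):
        """Bring a block into a buffer, evicting the LRU block if full."""
        if block_id in self.buffers:
            self.buffers.move_to_end(block_id)      # mark as recently used
        else:
            if len(self.buffers) >= self.capacity:
                victim, contents = self.buffers.popitem(last=False)
                if victim in self.dirty:            # store updated block
                    self.disk[victim] = contents
                    self.dirty.discard(victim)
            self.buffers[block_id] = dict(self.disk[block_id])
        return self.buffers[block_id]

    def read_item(self, item):
        block = self._fetch(self.location[item])    # find and copy block
        return block[item]                          # copy item to variable

    def write_item(self, item, value):
        block = self._fetch(self.location[item])    # find and copy block
        block[item] = value                         # copy variable to buffer
        self.dirty.add(self.location[item])         # written to disk later

disk = {0: {"X": 80}, 1: {"Y": 50}, 2: {"Z": 10}}
location = {"X": 0, "Y": 1, "Z": 2}
db = BufferedDB(disk, location, capacity=2)

x = db.read_item("X")
db.write_item("X", x - 5)
disk_x_before = disk[0]["X"]   # update is still only in the buffer
db.read_item("Y")
db.read_item("Z")              # cache full: block 0 is evicted and stored
disk_x_after = disk[0]["X"]
print(disk_x_before, disk_x_after)
```

The gap between disk_x_before and disk_x_after is exactly the window the recovery manager has to account for: an update can sit in a buffer for some time before it reaches the disk.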
The figure illustrates two example transactions, T1 and T2, in terms of their read and write
operations. Transaction T1 reads items X and Y, modifies them, and then writes them back,
while transaction T2 reads and modifies item X. The read-set of a transaction is the set of all
items that the transaction reads, and the write-set is the set of all items that it writes. For
example, in transaction T1, both the read-set and write-set are {X, Y}.
Transaction T1 transfers N reservations from one flight, whose number of reserved seats is
stored in the database item named X, to another flight, whose number of reserved seats is
stored in the database item named Y.
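The two transactions and their read-sets and write-sets can be sketched in Python as follows. The Transaction class, the starting values of X and Y, and the parameter m (the amount T2 adds to X) are assumptions for illustration; the text only states that T2 reads and modifies X.

```python
class Transaction:
    """Records which items a transaction reads and writes."""

    def __init__(self, db):
        self.db = db
        self.read_set = set()
        self.write_set = set()

    def read_item(self, name):
        self.read_set.add(name)
        return self.db[name]

    def write_item(self, name, value):
        self.write_set.add(name)
        self.db[name] = value

def run_t1(db, n):
    """T1: move n reservations from flight X to flight Y."""
    t = Transaction(db)
    x = t.read_item("X")
    x = x - n
    t.write_item("X", x)
    y = t.read_item("Y")
    y = y + n
    t.write_item("Y", y)
    return t

def run_t2(db, m):
    """T2: reserve m more seats on flight X (m is a hypothetical parameter)."""
    t = Transaction(db)
    x = t.read_item("X")
    x = x + m
    t.write_item("X", x)
    return t

db = {"X": 80, "Y": 50}          # hypothetical starting values
t1 = run_t1(db, n=5)
t2 = run_t2(db, m=4)
print(db, t1.read_set, t1.write_set, t2.read_set, t2.write_set)
```

Here the two transactions run one after the other (serially), so the final values are the ones expected; the problems described next arise only when their operations are interleaved.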
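b) The Temporary Update (or Dirty Read) Problem: This problem occurs when one
transaction updates a database item and then the transaction fails for some reason. Meanwhile,
the updated item is read by another transaction before it is changed back to its original value.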
d) The Unrepeatable Read Problem: Another problem that may occur is called unrepeatable
read, where a transaction T reads the same item twice and the item is changed by another
transaction T′ between the two reads.
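Both problems can be reproduced with a hand-written interleaving, as in the sketch below. It is a simulation in plain Python with no real concurrency; the starting value 80 and the amount 5 are arbitrary assumptions, and each comment states which transaction is supposed to be executing that step.

```python
def dirty_read_demo():
    db = {"X": 80}
    old_x = db["X"]          # T1 reads X and remembers the old value
    db["X"] = old_x - 5      # T1 writes X = 75 but has not committed
    seen_by_t2 = db["X"]     # T2 reads the temporary (dirty) value 75
    db["X"] = old_x          # T1 fails; X is changed back to 80
    return seen_by_t2, db["X"]

def unrepeatable_read_demo():
    db = {"X": 80}
    first = db["X"]          # T reads X for the first time
    db["X"] = db["X"] - 5    # T' updates X between the two reads
    second = db["X"]         # T reads X again and gets a different value
    return first, second

dirty = dirty_read_demo()
unrepeatable = unrepeatable_read_demo()
print(dirty, unrepeatable)
```

In the first case T2 has worked with a value (75) that never officially existed in the database; in the second, T sees two different values for the same item within one transaction.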
Why Recovery Is Needed
Whenever a transaction is submitted to a DBMS for execution, the system is responsible for
making sure that either all the operations in the transaction are completed successfully and their
effect is recorded permanently in the database, or that the transaction does not have any effect
on the database or any other transactions.
Types of Failures:
Failures are generally classified as transaction, system, and media failures. There are several
possible reasons for a transaction to fail in the middle of execution:
1. A computer failure (system crash).
A hardware, software, or network error occurs in the computer system during transaction
execution. Hardware crashes are usually media failures—for example, main memory failure.
2. A transaction or system error.
A transaction may be unable to complete because of internal errors or external system issues,
such as logical errors, syntax errors, system crashes, or disk failures.
3. Local errors or exception conditions detected by the transaction.
• During transaction execution, certain conditions may occur that necessitate cancellation
of the transaction.
4. Concurrency control enforcement.
The concurrency control method may decide to abort a transaction, to be restarted later,
because it violates serializability or because several transactions are in a state of deadlock.
5. Disk failure.
• Some disk blocks may lose their data because of a read or write malfunction, a disk
read/write head crash, or bad sectors (damaged areas of the disk).
6. Physical problems and catastrophes.
• This refers to an endless list of problems that includes power or air-conditioning failure,
fire, theft, overwriting disks or tapes by mistake, and mounting of a wrong tape by the
operator.
• Hence, before a transaction reaches its commit point, any portion of the log that has not
been written to the disk yet must now be written to the disk.
• This process is called force-writing the log buffer to disk before committing a transaction.
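Force-writing can be pictured with the toy model below. The Log class, the record formats, and the decision to include the commit record in the same forced write are assumptions made for illustration; a real DBMS writes the log buffer to disk in blocks and must also deal with partial writes.

```python
class Log:
    """Toy log: records wait in a main-memory buffer until forced to disk."""

    def __init__(self):
        self.buffer = []   # log records still in main memory (lost on crash)
        self.disk = []     # log records safely on disk (survive a crash)

    def append(self, record):
        self.buffer.append(record)

    def force_write(self):
        """Write every buffered log record to disk."""
        self.disk.extend(self.buffer)
        self.buffer.clear()

def commit(log, tid):
    # The transaction counts as committed only after its log records,
    # including the commit record, have reached the disk.
    log.append(("commit", tid))
    log.force_write()

log = Log()
log.append(("start_transaction", "T1"))
log.append(("write_item", "T1", "X", 80, 75))   # item, old value, new value

on_disk_before_commit = list(log.disk)   # nothing forced yet
commit(log, "T1")
on_disk_after_commit = list(log.disk)
print(on_disk_before_commit, on_disk_after_commit)
```

If the system crashed before commit was called, the disk copy of the log would hold no record of T1, so recovery would treat T1 as never having committed.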
❖ A transaction is said to have level 0 (zero) isolation if it does not overwrite the dirty reads
of higher-level transactions.
❖ Level 1 (one) isolation has no lost updates, and level 2 isolation has no lost updates and no
dirty reads.
❖ Finally, level 3 isolation (also called true isolation) has, in addition to level 2 properties,
repeatable reads.
THE_END: ... ;
1. The above transaction consists of first inserting a new row in the EMPLOYEE table and
then updating the salary of all employees who work in department 2.
2. If an error occurs on any of the SQL statements, the entire transaction is rolled back. This
implies that any updated salary (by this transaction) would be restored to its previous value
and that the newly inserted row would be removed.
3. As we have seen, SQL provides a number of transaction-oriented features. The DBA or
database programmers can take advantage of these options to try improving