Advanced SQL Queries
This module describes more advanced features of the SQL language standard for relational
databases.
We described some basic types of retrieval queries in SQL. Because of the generality and
expressive power of the language, there are many additional features that allow users to specify
more complex retrievals from the database.
SQL has various rules for dealing with NULL values. NULL is used to represent a missing
value, but it usually has one of three different interpretations: value unknown (exists but is
not known), value not available (exists but is purposely withheld), or value not applicable (the
attribute is undefined for this tuple).
In Tables 5.1(a) and 5.1(b), the rows and columns represent the values of the results of
comparison conditions, which would typically appear in the WHERE clause of an SQL query.
Each expression result would have a value of TRUE, FALSE, or UNKNOWN. The result of
combining the two values using the AND logical connective is shown by the entries in Table
5.1(a). Table 5.1(b) shows the result of using the OR logical connective. For example, the result
of (FALSE AND UNKNOWN) is FALSE, whereas the result of (FALSE OR UNKNOWN) is
UNKNOWN. Table 5.1(c) shows the result of the NOT logical operation. Notice that in standard
Boolean logic, only TRUE or FALSE values are permitted; there is no UNKNOWN value.
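The truth-table entries quoted above can be checked directly. The following is a minimal sketch in Python using the standard-library sqlite3 module. The choice of SQLite is an assumption made only so the example is runnable, and the helper name truth is hypothetical. SQLite reports TRUE as 1, FALSE as 0, and UNKNOWN as NULL (None in Python).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def truth(expr):
    """Evaluate a Boolean SQL expression: 1 = TRUE, 0 = FALSE, None = UNKNOWN."""
    return conn.execute(f"SELECT {expr}").fetchone()[0]

# Comparing NULL with NULL yields UNKNOWN, so it serves as the UNKNOWN operand.
unknown = "(NULL = NULL)"

false_and_unknown = truth(f"(1 = 0) AND {unknown}")  # FALSE AND UNKNOWN
false_or_unknown = truth(f"(1 = 0) OR {unknown}")    # FALSE OR UNKNOWN
true_and_unknown = truth(f"(1 = 1) AND {unknown}")   # TRUE AND UNKNOWN
true_or_unknown = truth(f"(1 = 1) OR {unknown}")     # TRUE OR UNKNOWN
not_unknown = truth(f"NOT {unknown}")                # NOT UNKNOWN

print(false_and_unknown, false_or_unknown, true_and_unknown,
      true_or_unknown, not_unknown)
```

Only rows whose WHERE condition evaluates to TRUE are selected, so a condition that evaluates to UNKNOWN filters a row out just as FALSE does.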
SQL allows queries that check whether an attribute value is NULL. Rather than using = or
<> to compare an attribute value to NULL, SQL uses the comparison operators IS or IS NOT.
This is because SQL considers each NULL value as being distinct from every other NULL value,
so equality comparison is not appropriate.
Query 18. Retrieve the names of all employees who do not have supervisors.
SELECT Fname, Lname
FROM EMPLOYEE
WHERE Super_ssn IS NULL;
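To see why IS NULL is needed, the sketch below runs Query 18 next to the same query written with = NULL. It uses Python's sqlite3 module as an assumed test bed, and the two EMPLOYEE rows are illustrative sample data, not taken from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT PRIMARY KEY, Super_ssn TEXT);
    -- Illustrative rows: the first employee has no supervisor.
    INSERT INTO EMPLOYEE VALUES ('James', 'Borg', '888665555', NULL);
    INSERT INTO EMPLOYEE VALUES ('Franklin', 'Wong', '333445555', '888665555');
""")

# Query 18 as written: IS NULL tests for the missing value.
with_is_null = conn.execute(
    "SELECT Fname, Lname FROM EMPLOYEE WHERE Super_ssn IS NULL").fetchall()

# The = comparison with NULL evaluates to UNKNOWN for every row, so nothing is selected.
with_equals = conn.execute(
    "SELECT Fname, Lname FROM EMPLOYEE WHERE Super_ssn = NULL").fetchall()

print(with_is_null, with_equals)
```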
IN operator:
IN is a comparison operator that compares a value v with a set (or multiset) of values
V and evaluates to TRUE if v is one of the elements in V.
Example: Query 17. Retrieve the Social Security numbers of all employees who work on
project numbers 1, 2, or 3.
SELECT DISTINCT Essn
FROM WORKS_ON
WHERE Pno IN (1, 2, 3);
SQL allows the use of tuples of values in comparisons by placing them within
parentheses. To illustrate this, consider the following query:
SELECT DISTINCT Essn
FROM WORKS_ON
WHERE (Pno, Hours) IN (SELECT Pno, Hours
                       FROM WORKS_ON
                       WHERE Essn = '123456789');
This query will select the Essns of all employees who work the same (project, hours)
combination on some project that employee ‘John Smith’ (whose Ssn = ‘123456789’) works on.
In this example, the IN operator compares the subtuple of values in parentheses (Pno, Hours)
within each tuple in WORKS_ON with the set of type-compatible tuples produced by the nested
query.
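The tuple comparison can be tried on a small instance. This is a sketch under stated assumptions: it uses Python's sqlite3 module, it relies on SQLite's row-value support (available from SQLite 3.15 onward), and the WORKS_ON rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE WORKS_ON (Essn TEXT, Pno INTEGER, Hours REAL);
    -- Illustrative rows only.
    INSERT INTO WORKS_ON VALUES ('123456789', 1, 32.5);
    INSERT INTO WORKS_ON VALUES ('123456789', 2, 7.5);
    INSERT INTO WORKS_ON VALUES ('453453453', 1, 32.5);  -- same (Pno, Hours) as a row above
    INSERT INTO WORKS_ON VALUES ('453453453', 2, 20.0);  -- same Pno, different Hours
    INSERT INTO WORKS_ON VALUES ('666884444', 3, 40.0);
""")

rows = conn.execute("""
    SELECT DISTINCT Essn
    FROM WORKS_ON
    WHERE (Pno, Hours) IN (SELECT Pno, Hours
                           FROM WORKS_ON
                           WHERE Essn = '123456789')
""").fetchall()

# Flatten and sort so the result does not depend on row order.
matching_essns = sorted(essn for (essn,) in rows)
print(matching_essns)
```

The employee whose own tuples define the comparison set also appears in the result, because each of that employee's (Pno, Hours) pairs trivially matches itself.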
In addition to the IN operator, a number of other comparison operators can be used to
compare a single value v (typically an attribute name) to a set or multiset V (typically a nested
query). The = ANY (or = SOME) operator returns TRUE if the value v is equal to some value in
the set V and is hence equivalent to IN. The two keywords ANY and SOME have the same
effect. Other operators that can be combined with ANY (or SOME) include >, >=, <, <=, and <>.
The keyword ALL can also be combined with each of these operators. For example,
the comparison condition (v > ALL V) returns TRUE if the value v is greater than all the values in
the set (or multiset) V.
For example, we can think of Q16 as follows: For each EMPLOYEE tuple, evaluate the
nested query, which retrieves the Essn values for all DEPENDENT tuples with the same sex and
name as that EMPLOYEE tuple; if the Ssn value of the EMPLOYEE tuple is in the result of the
nested query, then select that EMPLOYEE tuple.
In general, a query written with nested select-from-where blocks and using the = or IN
comparison operators can always be expressed as a single block query. For example, Q16 may
be written as in Q16A:
Q16A: SELECT E.Fname, E.Lname
FROM EMPLOYEE AS E, DEPENDENT AS D
WHERE E.Ssn = D.Essn AND E.Sex = D.Sex AND E.Fname = D.Dependent_name;
EXISTS and NOT EXISTS are typically used in conjunction with a correlated nested
query. In Q16B, the nested query references the Ssn, Fname, and Sex attributes of the
EMPLOYEE relation from the outer query. We can think of Q16B as follows: For each
EMPLOYEE tuple, evaluate the nested query, which retrieves all DEPENDENT tuples with the
same Essn, Sex, and Dependent_name as the EMPLOYEE tuple; if at least one tuple EXISTS in
the result of the nested query, then select that EMPLOYEE tuple.
In general, EXISTS (Q) returns TRUE if there is at least one tuple in the result of the
nested query Q, and it returns FALSE otherwise.
On the other hand, NOT EXISTS(Q) returns TRUE if there are no tuples in the result of
nested query Q, and it returns FALSE otherwise. Next, we illustrate the use of NOT EXISTS.
Query 6. Retrieve the names of employees who have no dependents.
SELECT Fname, Lname
FROM EMPLOYEE
WHERE NOT EXISTS (SELECT *
                  FROM DEPENDENT
                  WHERE Ssn = Essn);
In Q6, the correlated nested query retrieves all DEPENDENT tuples related to a
particular EMPLOYEE tuple. If none exist, the EMPLOYEE tuple is selected because the
WHERE-clause condition will evaluate to TRUE in this case. We can explain Q6 as follows: For
each EMPLOYEE tuple, the correlated nested query selects all DEPENDENT tuples whose Essn
value matches the EMPLOYEE Ssn; if the result is empty, no dependents are related to the
employee, so we select that EMPLOYEE tuple and retrieve its Fname and Lname.
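The behavior of EXISTS and NOT EXISTS can be sketched on a two-employee instance. The example assumes Python's sqlite3 module as the engine, and the rows are illustrative sample data. The EXISTS variant is added only for contrast and is not one of the numbered queries in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT PRIMARY KEY);
    CREATE TABLE DEPENDENT (Essn TEXT, Dependent_name TEXT);
    -- Illustrative rows: only the first employee has a dependent.
    INSERT INTO EMPLOYEE VALUES ('John', 'Smith', '123456789');
    INSERT INTO EMPLOYEE VALUES ('Joyce', 'English', '453453453');
    INSERT INTO DEPENDENT VALUES ('123456789', 'Alice');
""")

# Q6: the nested query is re-evaluated for each EMPLOYEE tuple (it is correlated via Ssn).
no_dependents = conn.execute("""
    SELECT Fname, Lname
    FROM EMPLOYEE
    WHERE NOT EXISTS (SELECT * FROM DEPENDENT WHERE Ssn = Essn)
""").fetchall()

# Contrast: EXISTS selects the employees that have at least one dependent.
has_dependents = conn.execute("""
    SELECT Fname, Lname
    FROM EMPLOYEE
    WHERE EXISTS (SELECT * FROM DEPENDENT WHERE Ssn = Essn)
""").fetchall()

print(no_dependents, has_dependents)
```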
4.1.5 Explicit Sets and Renaming of Attributes in SQL
Explicit Sets
We have seen several queries with a nested query in the WHERE clause. It is also
possible to use an explicit set of values in the WHERE clause, rather than a nested query. Such a
set is enclosed in parentheses in SQL.
Query 17. Retrieve the Social Security numbers of all employees who work on
project numbers 1, 2, or 3.
SELECT DISTINCT Essn
FROM WORKS_ON
WHERE Pno IN (1, 2, 3);
Renaming of Attributes
In SQL, it is possible to rename any attribute that appears in the result of a query by
adding the qualifier AS followed by the desired new name. Hence, the AS construct can be used
to alias both attribute and relation names, and it can be used in both the SELECT and FROM
clauses.
For example, the following query retrieves the last name of each employee and his or her
supervisor, while renaming the resulting attribute names as Employee_name and
Supervisor_name. The new names will appear as column headers in the query result.
SELECT E.Lname AS Employee_name, S.Lname AS Supervisor_name
FROM EMPLOYEE AS E, EMPLOYEE AS S
WHERE E.Super_ssn = S.Ssn;
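The effect of AS on the column headers can be observed through a database cursor. The sketch below assumes Python's sqlite3 module and three illustrative EMPLOYEE rows; cursor.description exposes the result column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Lname TEXT, Ssn TEXT PRIMARY KEY, Super_ssn TEXT);
    -- Illustrative rows only.
    INSERT INTO EMPLOYEE VALUES ('Borg',  '888665555', NULL);
    INSERT INTO EMPLOYEE VALUES ('Wong',  '333445555', '888665555');
    INSERT INTO EMPLOYEE VALUES ('Smith', '123456789', '333445555');
""")

cur = conn.execute("""
    SELECT E.Lname AS Employee_name, S.Lname AS Supervisor_name
    FROM EMPLOYEE AS E, EMPLOYEE AS S
    WHERE E.Super_ssn = S.Ssn
""")

# The first field of each description entry is the column header.
headers = [col[0] for col in cur.description]
pairs = sorted(cur.fetchall())
print(headers, pairs)
```

The relation aliases E and S are what make the self-join possible: without them, the two roles of EMPLOYEE could not be told apart.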
The default type of join in a joined table is called an inner join, where a tuple is included
in the result only if a matching tuple exists in the other relation.
There are a variety of outer join operations.
1) LEFT OUTER JOIN (every tuple in the left table must appear in the result; if
it does not have a matching tuple, it is padded with NULL values for the
attributes of the right table).
2) RIGHT OUTER JOIN (every tuple in the right table must appear in the
result; if it does not have a matching tuple, it is padded with NULL values for
the attributes of the left table).
3) FULL OUTER JOIN (a combination of left and right outer joins).
In the latter three options, the keyword OUTER may be omitted. If the join attributes
have the same name, one can also specify the natural join variation of outer joins by
using the keyword NATURAL before the operation (for example, NATURAL LEFT
OUTER JOIN).
The keyword CROSS JOIN is used to specify the CARTESIAN PRODUCT operation
although this should be used only with the utmost care because it generates all possible
tuple combinations.
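The difference between the join types is easiest to see by counting result tuples. The following sketch assumes Python's sqlite3 module and three illustrative EMPLOYEE rows. It covers only the inner join, LEFT OUTER JOIN, and CROSS JOIN, since RIGHT and FULL OUTER JOIN are not available in older SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Lname TEXT, Ssn TEXT PRIMARY KEY, Super_ssn TEXT);
    -- Illustrative rows: Borg has no supervisor.
    INSERT INTO EMPLOYEE VALUES ('Borg',  '888665555', NULL);
    INSERT INTO EMPLOYEE VALUES ('Wong',  '333445555', '888665555');
    INSERT INTO EMPLOYEE VALUES ('Smith', '123456789', '333445555');
""")

# Inner join: employees without a matching supervisor tuple are dropped.
inner_rows = conn.execute("""
    SELECT E.Lname, S.Lname
    FROM EMPLOYEE AS E JOIN EMPLOYEE AS S ON E.Super_ssn = S.Ssn
""").fetchall()

# Left outer join: every E tuple appears; missing supervisors are padded with NULL.
left_rows = conn.execute("""
    SELECT E.Lname, S.Lname
    FROM EMPLOYEE AS E LEFT OUTER JOIN EMPLOYEE AS S ON E.Super_ssn = S.Ssn
""").fetchall()

# Cross join: the Cartesian product, 3 x 3 tuple combinations.
cross_rows = conn.execute("""
    SELECT E.Lname, S.Lname
    FROM EMPLOYEE AS E CROSS JOIN EMPLOYEE AS S
""").fetchall()

print(len(inner_rows), len(left_rows), len(cross_rows))
```

The cross join grows with the product of the table sizes, which is why the text advises using it with care.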
It is also possible to nest join specifications; that is, one of the tables in a join may itself
be a joined table. This allows the specification of the join of three or more tables as a single
joined table, which is called a multiway join.
Not all SQL implementations adopted the joined-table syntax at the same time. Some
systems used a different syntax, specifying outer joins with special operators in the join
condition of the WHERE clause. In Oracle's legacy notation, for example, the marker (+) is
placed after the attribute of the table whose values may be padded with NULLs. To specify the
left outer join using this syntax, we could write the query as follows:
SELECT E.Lname, S.Lname
FROM EMPLOYEE E, EMPLOYEE S
WHERE E.Super_ssn = S.Ssn (+);
4.1.7 Aggregate Functions in SQL
Aggregate functions are used to summarize information from multiple tuples into a single-tuple
summary. Grouping is used to create subgroups of tuples before summarization. Grouping and
aggregation are required in many database applications, and we will introduce their use in SQL
through examples.
A number of built-in aggregate functions exist: COUNT, SUM, MAX, MIN, and AVG.
The COUNT function returns the number of tuples or values as specified in a query. The
functions SUM, MAX, MIN, and AVG can be applied to a set or multiset of numeric values and
return, respectively, the sum, maximum value, minimum value, and average (mean) of those
values.
Query 19. Find the sum of the salaries of all employees, the maximum salary, the
minimum salary, and the average salary.
SELECT SUM(Salary), MAX(Salary), MIN(Salary), AVG(Salary)
FROM EMPLOYEE;
If we want to get the preceding function values for employees of a specific department—say, the
‘Research’ department—we can write Query 20, where the EMPLOYEE tuples are restricted by
the WHERE clause to those employees who work for the ‘Research’ department.
Query 20. Find the sum of the salaries of all employees of the ‘Research’
department, as well as the maximum salary, the minimum salary, and the average salary in
this department.
SELECT SUM(Salary), MAX(Salary), MIN(Salary), AVG(Salary)
FROM (EMPLOYEE JOIN DEPARTMENT ON Dno = Dnumber)
WHERE Dname = 'Research';
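Queries 19 and 20 can be run on a small instance to see how the WHERE clause narrows the tuples that are summarized. This is a sketch assuming Python's sqlite3 module; the salaries and department rows are made-up sample data, and the join is written without the enclosing parentheses for portability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPARTMENT (Dname TEXT, Dnumber INTEGER PRIMARY KEY);
    CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Salary INTEGER, Dno INTEGER);
    -- Illustrative rows only.
    INSERT INTO DEPARTMENT VALUES ('Research', 5);
    INSERT INTO DEPARTMENT VALUES ('Headquarters', 1);
    INSERT INTO EMPLOYEE VALUES ('123456789', 30000, 5);
    INSERT INTO EMPLOYEE VALUES ('333445555', 40000, 5);
    INSERT INTO EMPLOYEE VALUES ('888665555', 55000, 1);
""")

# Query 19: one summary tuple over all employees.
q19 = conn.execute("""
    SELECT SUM(Salary), MAX(Salary), MIN(Salary), AVG(Salary)
    FROM EMPLOYEE
""").fetchone()

# Query 20: the same summary restricted to the 'Research' department.
q20 = conn.execute("""
    SELECT SUM(Salary), MAX(Salary), MIN(Salary), AVG(Salary)
    FROM EMPLOYEE JOIN DEPARTMENT ON Dno = Dnumber
    WHERE Dname = 'Research'
""").fetchone()

print(q19, q20)
```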
Queries 21 and 22. Retrieve the total number of employees in the company (Q21) and the
number of employees in the ‘Research’ department (Q22).
When COUNT is applied to a particular attribute, tuples with NULL for that attribute will
not be counted. In general, NULL values are discarded when aggregate functions are applied to a particular
column (attribute).
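The treatment of NULL by aggregate functions can be checked with a short experiment. The sketch assumes Python's sqlite3 module and three illustrative rows, one with a NULL salary. The COUNT(*) formulation used for the total number of employees is an assumption about how such a query might be written, not a quotation of Q21.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Salary INTEGER);
    -- Illustrative rows: the third salary is unknown.
    INSERT INTO EMPLOYEE VALUES ('123456789', 30000);
    INSERT INTO EMPLOYEE VALUES ('333445555', 40000);
    INSERT INTO EMPLOYEE VALUES ('999887777', NULL);
""")

# COUNT(*) counts tuples; COUNT(Salary) counts only non-NULL Salary values.
count_rows, count_salaries, avg_salary = conn.execute("""
    SELECT COUNT(*), COUNT(Salary), AVG(Salary)
    FROM EMPLOYEE
""").fetchone()

print(count_rows, count_salaries, avg_salary)
```

AVG divides by the number of non-NULL values, so the average here is taken over two salaries, not three.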
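With grouping, the tuples of a relation are partitioned into groups, where each group consists of
the tuples that have the same value of some attribute(s), called the grouping attribute(s). We can
then apply the function to each such group independently to produce summary information about
each group.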
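Grouping can be sketched as follows, with Dno as the grouping attribute. The example assumes Python's sqlite3 module and made-up salary data. The ORDER BY is added only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Salary INTEGER, Dno INTEGER);
    -- Illustrative rows only.
    INSERT INTO EMPLOYEE VALUES ('123456789', 30000, 5);
    INSERT INTO EMPLOYEE VALUES ('333445555', 40000, 5);
    INSERT INTO EMPLOYEE VALUES ('999887777', 25000, 4);
    INSERT INTO EMPLOYEE VALUES ('987654321', 43000, 4);
    INSERT INTO EMPLOYEE VALUES ('987987987', 25000, 4);
    INSERT INTO EMPLOYEE VALUES ('888665555', 55000, 1);
""")

# One result tuple per distinct Dno value; COUNT and AVG are applied to each group.
per_department = conn.execute("""
    SELECT Dno, COUNT(*), AVG(Salary)
    FROM EMPLOYEE
    GROUP BY Dno
    ORDER BY Dno
""").fetchall()

print(per_department)
```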
HAVING clause
SQL provides a HAVING clause, which can appear in conjunction with a GROUP BY clause.
HAVING provides a condition on the summary information regarding the group of tuples
associated with each value of the grouping attributes. Only the groups that satisfy the condition
are retrieved in the result of the query. This is illustrated by Query 26.
In order to formulate queries correctly, it is useful to consider the steps that define the meaning
or semantics of each query. A query is evaluated conceptually by first applying the FROM
clause (to identify all tables involved in the query or to materialize any joined tables), followed
by the WHERE clause to select and join tuples, and then by GROUP BY and HAVING.
Conceptually, ORDER BY is applied at the end to sort the query result.
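The HAVING clause and the conceptual evaluation order can be illustrated together. In this sketch (Python's sqlite3 module and made-up data are assumptions), the first query filters groups only, while the second also filters tuples with WHERE before the groups are formed, which changes the group sizes that HAVING sees.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Salary INTEGER, Dno INTEGER);
    -- Illustrative rows only.
    INSERT INTO EMPLOYEE VALUES ('123456789', 30000, 5);
    INSERT INTO EMPLOYEE VALUES ('333445555', 40000, 5);
    INSERT INTO EMPLOYEE VALUES ('999887777', 25000, 4);
    INSERT INTO EMPLOYEE VALUES ('987654321', 43000, 4);
    INSERT INTO EMPLOYEE VALUES ('987987987', 25000, 4);
    INSERT INTO EMPLOYEE VALUES ('888665555', 55000, 1);
""")

# HAVING keeps only the groups (departments) with more than one employee.
large_departments = conn.execute("""
    SELECT Dno, COUNT(*)
    FROM EMPLOYEE
    GROUP BY Dno
    HAVING COUNT(*) > 1
    ORDER BY Dno
""").fetchall()

# WHERE removes low-salary tuples first; department 4 then has a single tuple
# left and no longer satisfies the HAVING condition.
large_after_filter = conn.execute("""
    SELECT Dno, COUNT(*)
    FROM EMPLOYEE
    WHERE Salary > 26000
    GROUP BY Dno
    HAVING COUNT(*) > 1
    ORDER BY Dno
""").fetchall()

print(large_departments, large_after_filter)
```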
CREATE ASSERTION, which can be used to specify additional types of constraints that
are outside the scope of the built-in relational model constraints (primary and unique keys, entity
integrity, and referential integrity) that we presented earlier.
CREATE TRIGGER, which can be used to specify automatic actions that the database
system will perform when certain events and conditions occur. This type of functionality is
generally referred to as active databases.
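A trigger combines an event, a condition, and an action. The following is a minimal sketch in SQLite's trigger dialect, run through Python's sqlite3 module; both are assumptions, and other systems use different trigger syntax. The trigger name SALARY_VIOLATION, the log table SALARY_ALERT, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Salary INTEGER, Super_ssn TEXT);
    CREATE TABLE SALARY_ALERT (Ssn TEXT, Super_ssn TEXT);  -- hypothetical log table

    CREATE TRIGGER SALARY_VIOLATION
    AFTER UPDATE OF Salary ON EMPLOYEE                      -- event
    WHEN NEW.Salary > (SELECT Salary FROM EMPLOYEE
                       WHERE Ssn = NEW.Super_ssn)           -- condition
    BEGIN
        INSERT INTO SALARY_ALERT VALUES (NEW.Ssn, NEW.Super_ssn);  -- action
    END;

    -- Illustrative rows: the second employee is supervised by the first.
    INSERT INTO EMPLOYEE VALUES ('888665555', 55000, NULL);
    INSERT INTO EMPLOYEE VALUES ('333445555', 40000, '888665555');
""")

# Still below the supervisor's salary: the condition is false, so no action runs.
conn.execute("UPDATE EMPLOYEE SET Salary = 45000 WHERE Ssn = '333445555'")
alerts_after_first = conn.execute("SELECT * FROM SALARY_ALERT").fetchall()

# Now above the supervisor's salary: the trigger fires and logs the pair.
conn.execute("UPDATE EMPLOYEE SET Salary = 60000 WHERE Ssn = '333445555'")
alerts_after_second = conn.execute("SELECT * FROM SALARY_ALERT").fetchall()

print(alerts_after_first, alerts_after_second)
```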
The basic technique for writing such assertions is to specify a query that selects any tuples that
violate the desired condition. By including this query inside a NOT EXISTS clause, the assertion
will specify that the result of this query must be empty so that the condition will always be
TRUE. Thus, the assertion is violated if the result of the query is not empty. In the preceding
example, the query selects all employees whose salaries are greater than the salary of the
manager of their department. If the result of the query is not empty, the assertion is violated.
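The NOT EXISTS technique can be tried even on a system without CREATE ASSERTION. SQLite, used here through Python's sqlite3 module, is such a system, so this sketch evaluates the assertion condition as an ordinary query instead of declaring it. The function name salary_constraint_holds and the sample rows are hypothetical, and the violation query is one possible formulation of the salary condition described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPARTMENT (Dnumber INTEGER PRIMARY KEY, Mgr_ssn TEXT);
    CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Salary INTEGER, Dno INTEGER);
    -- Illustrative rows: employee 333445555 manages department 5.
    INSERT INTO DEPARTMENT VALUES (5, '333445555');
    INSERT INTO EMPLOYEE VALUES ('333445555', 40000, 5);
    INSERT INTO EMPLOYEE VALUES ('123456789', 30000, 5);
""")

# Selects every employee who earns more than the manager of his or her department.
VIOLATIONS = """
    SELECT *
    FROM EMPLOYEE E, EMPLOYEE M, DEPARTMENT D
    WHERE E.Salary > M.Salary
      AND E.Dno = D.Dnumber
      AND D.Mgr_ssn = M.Ssn
"""

def salary_constraint_holds(connection):
    """Return True when the violation query is empty, i.e. the condition holds."""
    row = connection.execute(f"SELECT NOT EXISTS ({VIOLATIONS})").fetchone()
    return bool(row[0])

holds_before = salary_constraint_holds(conn)

# A new employee who out-earns the department manager violates the condition.
conn.execute("INSERT INTO EMPLOYEE VALUES ('999887777', 45000, 5)")
holds_after = salary_constraint_holds(conn)

print(holds_before, holds_after)
```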
Query Modification
Modifying the view query into a query on the underlying base tables
Disadvantage: inefficient for views defined via complex queries that are time-consuming
to execute, especially if multiple queries are applied to the view within a short period of
time.
Example
The example query on the view would be automatically modified by the DBMS to the
following query:
SELECT Fname, Lname
FROM EMPLOYEE, PROJECT, WORKS_ON
WHERE Ssn = Essn AND Pno = Pnumber
AND Pname = 'ProductX';
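Query modification can be checked by comparing a query on the view with its rewritten form on the base tables. The sketch assumes Python's sqlite3 module and made-up rows. The definition of the view WORKS_ON1 is also an assumption, chosen to be consistent with the rewritten query above, since the view definition does not appear in this text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT PRIMARY KEY);
    CREATE TABLE PROJECT (Pname TEXT, Pnumber INTEGER PRIMARY KEY);
    CREATE TABLE WORKS_ON (Essn TEXT, Pno INTEGER, Hours REAL);

    -- Assumed view definition (not shown in the text).
    CREATE VIEW WORKS_ON1 AS
        SELECT Fname, Lname, Pname, Hours
        FROM EMPLOYEE, PROJECT, WORKS_ON
        WHERE Ssn = Essn AND Pno = Pnumber;

    -- Illustrative rows only.
    INSERT INTO EMPLOYEE VALUES ('John', 'Smith', '123456789');
    INSERT INTO EMPLOYEE VALUES ('Joyce', 'English', '453453453');
    INSERT INTO EMPLOYEE VALUES ('Franklin', 'Wong', '333445555');
    INSERT INTO PROJECT VALUES ('ProductX', 1);
    INSERT INTO PROJECT VALUES ('ProductY', 2);
    INSERT INTO WORKS_ON VALUES ('123456789', 1, 32.5);
    INSERT INTO WORKS_ON VALUES ('453453453', 1, 20.0);
    INSERT INTO WORKS_ON VALUES ('333445555', 2, 10.0);
""")

# The query as a user would write it against the view.
on_view = sorted(conn.execute(
    "SELECT Fname, Lname FROM WORKS_ON1 WHERE Pname = 'ProductX'").fetchall())

# The modified query on the underlying base tables.
on_base_tables = sorted(conn.execute("""
    SELECT Fname, Lname
    FROM EMPLOYEE, PROJECT, WORKS_ON
    WHERE Ssn = Essn AND Pno = Pnumber
      AND Pname = 'ProductX'
""").fetchall())

print(on_view, on_base_tables)
```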
View Materialization
Physically create a temporary view table when the view is first queried
Keep that table on the assumption that other queries on the view will follow
Requires an efficient strategy for automatically updating the view table when the base tables
are updated, that is, incremental update
Incremental update determines what new tuples must be inserted, deleted, or modified
in a materialized view table when a change is applied to one of the defining base tables
View Update
Updating of views is complicated and can be ambiguous
An update on a view defined on a single table without any aggregate functions can be
mapped to an update on the underlying base table under certain conditions.
For a view involving joins, an update operation may be mapped to update operations on the
underlying base relations in multiple ways.
Example
Update the Pname attribute of ‘john smith’ from ‘ProductX’ to ‘ProductY’
UPDATE WORKS_ON1
SET Pname = 'ProductY'
WHERE Lname = 'smith' AND Fname = 'john'
AND Pname = 'ProductX';
This query can be mapped into several updates on the base relations to give the desired
effect on the view.
Two possible updates, (a) and (b), on the base relations correspond to the above query.
(a) UPDATE WORKS_ON
    SET Pno = (SELECT Pnumber
               FROM PROJECT
               WHERE Pname = 'ProductY')
    WHERE Essn IN (SELECT Ssn
                   FROM EMPLOYEE
                   WHERE Lname = 'smith' AND Fname = 'john')
    AND Pno = (SELECT Pnumber
               FROM PROJECT
               WHERE Pname = 'ProductX');
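The effect of update (a) on the view can be observed on a small instance. This sketch assumes Python's sqlite3 module, an assumed definition of the view WORKS_ON1, and made-up rows. SQLite does not accept an UPDATE issued directly against a view, so the example applies (a) to the base tables and then reads the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT PRIMARY KEY);
    CREATE TABLE PROJECT (Pname TEXT, Pnumber INTEGER PRIMARY KEY);
    CREATE TABLE WORKS_ON (Essn TEXT, Pno INTEGER, Hours REAL);

    -- Assumed view definition (not shown in the text).
    CREATE VIEW WORKS_ON1 AS
        SELECT Fname, Lname, Pname, Hours
        FROM EMPLOYEE, PROJECT, WORKS_ON
        WHERE Ssn = Essn AND Pno = Pnumber;

    -- Illustrative rows only.
    INSERT INTO EMPLOYEE VALUES ('john', 'smith', '123456789');
    INSERT INTO PROJECT VALUES ('ProductX', 1);
    INSERT INTO PROJECT VALUES ('ProductY', 2);
    INSERT INTO WORKS_ON VALUES ('123456789', 1, 32.5);
""")

view_before = conn.execute("SELECT * FROM WORKS_ON1").fetchall()

# Update (a): re-point the WORKS_ON tuple from ProductX to ProductY.
conn.execute("""
    UPDATE WORKS_ON
    SET Pno = (SELECT Pnumber FROM PROJECT WHERE Pname = 'ProductY')
    WHERE Essn IN (SELECT Ssn FROM EMPLOYEE
                   WHERE Lname = 'smith' AND Fname = 'john')
      AND Pno = (SELECT Pnumber FROM PROJECT WHERE Pname = 'ProductX')
""")

view_after = conn.execute("SELECT * FROM WORKS_ON1").fetchall()
print(view_before, view_after)
```

Update (a) changes only the WORKS_ON tuple, so other employees who work on ProductX are unaffected and the PROJECT relation itself is left unchanged.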
OBSERVATIONS ON VIEWS
A view with a single defining table is updatable if the view attributes contain the primary
key of the base relation, as well as all attributes with the NOT NULL constraint that do
not have default values specified.
Views defined on multiple tables using joins are generally not updatable.
Views defined using grouping and aggregate functions are not updatable.
In SQL, the clause WITH CHECK OPTION must be added at the end of the view
definition if a view is to be updated.
Advantages of Views
Data independence
Currency
Improved security
Reduced complexity
Convenience
Customization
Data integrity