SQL (2)
Databases 1 Course
Learning Objectives
After completing this topic, given a logical
database schema, you are expected to be able
to implement DDL and DML on one of the
popular DBMSs, including choosing the
appropriate data type for each field and
suitable constraints for each relation.
Content Development GDLN Batch 2 2
Outline
Tables as Sets in SQL
Substring pattern matching
Arithmetic operations
NULL values in SQL
Nested queries
EXISTS FUNCTION
EXPLICIT SET & RENAMING ATTRIBUTE
JOIN
AGGREGATE FUNCTION
GROUPING & HAVING CLAUSE
Set Operations
SQL has directly incorporated some set operations
There is a union operation (UNION), and in some
versions of SQL there are set difference (EXCEPT,
called MINUS in some implementations) and
intersection (INTERSECT) operations
The resulting relations of these set operations are sets
of tuples; duplicate tuples are eliminated from the
result
The set operations apply only to union-compatible
relations; the two relations must have the same
attributes and the attributes must appear in the same
order
Set Operations (cont.)
Query 4: Make a list of all project numbers for projects that involve an
employee whose last name is 'Smith' as a worker or as a manager of
the department that controls the project.
Q4: (SELECT PNUMBER
FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE DNUM=DNUMBER AND MGRSSN=SSN
AND LNAME='Smith')
UNION
(SELECT PNUMBER
FROM PROJECT, WORKS_ON, EMPLOYEE
WHERE PNUMBER=PNO AND ESSN=SSN AND
LNAME='Smith')
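The set operations can be tried out with a minimal sketch, assuming Python's built-in sqlite3 module. The two one-column tables and their rows are hypothetical stand-ins for two union-compatible query results; SQLite spells set difference as EXCEPT.

```python
import sqlite3

# Hypothetical tables standing in for two union-compatible query results.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE managed_projects (pnumber INTEGER);
    CREATE TABLE worked_projects  (pnumber INTEGER);
    INSERT INTO managed_projects VALUES (1), (2), (2);
    INSERT INTO worked_projects  VALUES (2), (3);
""")

def project_numbers(set_operator):
    """Combine the two tables with the given set operator; return sorted values."""
    sql = (f"SELECT pnumber FROM managed_projects {set_operator} "
           "SELECT pnumber FROM worked_projects ORDER BY pnumber")
    return [row[0] for row in conn.execute(sql)]

union_result = project_numbers("UNION")          # duplicate 2 appears once
intersect_result = project_numbers("INTERSECT")  # values present in both tables
except_result = project_numbers("EXCEPT")        # managed minus worked
print(union_result, intersect_result, except_result)
```

Note that the duplicate value 2 in the first table appears only once in the UNION result, as described above.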
Substring Comparison
The LIKE comparison operator is used
to compare partial strings
Two reserved characters are used: '%'
(or '*' in some implementations)
replaces an arbitrary number of
characters, and '_' replaces a single
arbitrary character
Substring Comparison (cont.)
Query 12: Retrieve all employees whose address is
in Houston, Texas. Here, the value of the ADDRESS
attribute must contain the substring 'Houston, TX'.
Q12: SELECT FNAME, LNAME
FROM EMPLOYEE
WHERE ADDRESS LIKE '%Houston, TX%'
Substring Comparison (cont.)
Query 12A: Retrieve all employees who were born during the
1950s. Here, '5' must be the 8th character of the string (according
to our format for date), so the BDATE value is '_______5_', with
each underscore as a placeholder for a single arbitrary character.
Q12A: SELECT FNAME, LNAME
FROM EMPLOYEE
WHERE BDATE LIKE '_______5_'
If an underscore or % is needed as a literal character in the string, the
character should be preceded by an escape character, which is declared
with the ESCAPE keyword.
For example, 'AB\_CD\%EF' ESCAPE '\' represents the literal string 'AB_CD%EF'
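The wildcard and escape rules can be checked with a small sketch, assuming Python's built-in sqlite3 module. The table `sample` and its three rows are hypothetical, and the helper `matching` is introduced only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (label TEXT)")  # hypothetical table
conn.executemany("INSERT INTO sample VALUES (?)",
                 [("AB_CD%EF",), ("ABxCDyyEF",), ("731 Fondren, Houston, TX",)])

def matching(pattern, escape=None):
    """Return the labels that match a LIKE pattern, optionally with ESCAPE."""
    sql = "SELECT label FROM sample WHERE label LIKE ?"
    params = [pattern]
    if escape is not None:
        sql += " ESCAPE ?"
        params.append(escape)
    sql += " ORDER BY label"
    return [row[0] for row in conn.execute(sql, params)]

# '%' matches any run of characters, so the substring can sit anywhere.
houston = matching("%Houston, TX%")
# Unescaped, '_' and '%' act as wildcards and match both 'AB...' rows.
wildcards = matching("AB_CD%EF")
# Escaped, they are literal characters and match only the exact string.
literal = matching("AB\\_CD\\%EF", escape="\\")
print(houston, wildcards, literal)
```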
Arithmetic Operations
The standard arithmetic operators '+', '-', '*', and '/' (for addition,
subtraction, multiplication, and division, respectively) can be applied
to numeric values in an SQL query result
Query 13: Show the effect of giving all employees who work on the
'ProductX' project a 10% raise.
Q13: SELECT FNAME, LNAME, 1.1*SALARY
AS INCREASED_SAL
FROM EMPLOYEE, WORKS_ON, PROJECT
WHERE SSN=ESSN AND PNO=PNUMBER AND
PNAME='ProductX'
Arithmetic Operations (2)
Query 14: Retrieve all employees in department 5 whose salary is
between $30,000 and $40,000
Q14:SELECT *
FROM EMPLOYEE
WHERE (SALARY BETWEEN 30000 AND 40000)
AND DNO=5;
Q14A:SELECT *
FROM EMPLOYEE
WHERE (SALARY >= 30000 AND SALARY <=40000)
AND DNO=5;
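That Q14 and Q14A select the same tuples can be verified with a short sketch, assuming Python's built-in sqlite3 module and a hypothetical, cut-down EMPLOYEE table. It also evaluates a computed column in the style of Q13.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified EMPLOYEE table with made-up rows.
conn.execute("CREATE TABLE employee (lname TEXT, salary INTEGER, dno INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    ("Smith", 30000, 5), ("Wong", 40000, 5), ("Narayan", 38000, 5),
    ("English", 25000, 5), ("Wallace", 43000, 4),
])

# BETWEEN includes both end points ...
between_rows = conn.execute(
    "SELECT lname FROM employee "
    "WHERE (salary BETWEEN 30000 AND 40000) AND dno = 5 ORDER BY lname").fetchall()
# ... so it is equivalent to the explicit pair of comparisons.
explicit_rows = conn.execute(
    "SELECT lname FROM employee "
    "WHERE (salary >= 30000 AND salary <= 40000) AND dno = 5 ORDER BY lname").fetchall()

# Arithmetic in the SELECT-clause, renamed with AS as in Q13.
raised = conn.execute(
    "SELECT lname, 1.1 * salary AS increased_sal FROM employee "
    "WHERE lname = 'Smith'").fetchone()
print(between_rows, explicit_rows, raised)
```

The raise is computed in floating point, so the result should be compared with a tolerance rather than for exact equality.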
Order By
The ORDER BY clause is used to sort the tuples
in a query result based on the values of some
attribute(s)
Query 15: Retrieve a list of employees and the
projects each works on, ordered by the
employee's department, and within each
department ordered alphabetically by employee
last name.
Q15: SELECT DNAME, LNAME, FNAME, PNAME
FROM DEPARTMENT, EMPLOYEE,
WORKS_ON, PROJECT
WHERE DNUMBER=DNO AND SSN=ESSN
AND PNO=PNUMBER
ORDER BY DNAME, LNAME
Order By (cont.)
The default order is in ascending order of values
We can specify the keyword DESC if we want a
descending order; the keyword ASC can be used
to explicitly specify ascending order, even though
it is the default
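A minimal sketch of ASC and DESC, assuming Python's built-in sqlite3 module. The table `staff` and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (dname TEXT, lname TEXT)")  # hypothetical table
conn.executemany("INSERT INTO staff VALUES (?, ?)", [
    ("Research", "Wong"), ("Research", "Smith"),
    ("Administration", "Zelaya"), ("Administration", "Jabbar"),
])

# Default ordering: ascending on both attributes.
ascending = conn.execute(
    "SELECT dname, lname FROM staff ORDER BY dname, lname").fetchall()
# Departments descending, last names ascending within each department.
mixed = conn.execute(
    "SELECT dname, lname FROM staff ORDER BY dname DESC, lname ASC").fetchall()
print(ascending)
print(mixed)
```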
Nulls in SQL Queries
SQL allows queries that check if a value is NULL (missing or
undefined or not applicable)
SQL uses IS or IS NOT to compare NULLs because it considers
each NULL value distinct from other NULL values, so equality
comparison is not appropriate.
Query 18: Retrieve the names of all employees who do not have
supervisors.
Q18: SELECT FNAME, LNAME
FROM EMPLOYEE
WHERE SUPERSSN IS NULL
Note: If a join condition is specified, tuples with NULL values for
the join attributes are not included in the result (unless an outer join is used)
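The behaviour of NULL in comparisons and joins can be seen in a short sketch, assuming Python's built-in sqlite3 module. The three EMPLOYEE rows are hypothetical, and Python's None is stored as SQL NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified EMPLOYEE table; None is stored as NULL.
conn.execute("CREATE TABLE employee (lname TEXT, ssn TEXT, superssn TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    ("Borg", "888665555", None),
    ("Wong", "333445555", "888665555"),
    ("Smith", "123456789", "333445555"),
])

# IS NULL finds the employee without a supervisor.
no_supervisor = [r[0] for r in conn.execute(
    "SELECT lname FROM employee WHERE superssn IS NULL")]
# '= NULL' is never true, so this query returns no rows at all.
equals_null = [r[0] for r in conn.execute(
    "SELECT lname FROM employee WHERE superssn = NULL")]
# In the join, the tuple with a NULL join attribute (Borg) drops out.
supervised = conn.execute(
    "SELECT e.lname, s.lname FROM employee e, employee s "
    "WHERE e.superssn = s.ssn ORDER BY e.lname").fetchall()
print(no_supervisor, equals_null, supervised)
```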
Nesting of Queries
Some queries require that existing values in the database be
fetched and then used in a comparison condition; this can be
done with a nested query
A nested query is a complete SELECT-FROM-WHERE block
within the WHERE-clause of another query
That other query is called the outer query
Query 1A: Retrieve the name and address of all employees who
work for the 'Research' department.
Q1A: SELECT FNAME, LNAME, ADDRESS          (outer query)
FROM EMPLOYEE
WHERE DNO IN
(SELECT DNUMBER                            (nested query)
FROM DEPARTMENT
WHERE DNAME='Research')
Nesting of Queries (cont.)
The nested query selects the number of the
'Research' department
The outer query selects an EMPLOYEE tuple if
its DNO value is in the result of the nested
query
The comparison operator IN compares a value
v with a set (or multi-set) of values V, and
evaluates to TRUE if v is one of the elements
in V
In general, we can have several levels of
nested queries
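Q1A can be run with a minimal sketch, assuming Python's built-in sqlite3 module and hypothetical, cut-down EMPLOYEE and DEPARTMENT tables. The second query is the equivalent single-block join, included for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified tables with made-up rows.
conn.executescript("""
    CREATE TABLE department (dname TEXT, dnumber INTEGER);
    CREATE TABLE employee (lname TEXT, dno INTEGER);
    INSERT INTO department VALUES ('Research', 5), ('Administration', 4);
    INSERT INTO employee VALUES ('Smith', 5), ('Wong', 5), ('Zelaya', 4);
""")

# The nested query yields the set {5}; IN tests membership in that set.
nested = [r[0] for r in conn.execute("""
    SELECT lname FROM employee
    WHERE dno IN (SELECT dnumber FROM department WHERE dname = 'Research')
    ORDER BY lname""")]

# The same request written as a single-block query with a join condition.
joined = [r[0] for r in conn.execute("""
    SELECT lname FROM employee, department
    WHERE dno = dnumber AND dname = 'Research'
    ORDER BY lname""")]
print(nested, joined)
```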
Nesting of Queries (cont.)
SQL allows the use of tuples of values in comparisons
by placing them within parentheses
Query: Retrieve the SSNs of all employees who work
the same (project, hours) combination on some project
that employee 'John Smith' (ESSN = '123456789')
works on.
SELECT DISTINCT ESSN
FROM WORKS_ON
WHERE (PNO, HOURS) IN
(SELECT PNO, HOURS
FROM WORKS_ON
WHERE ESSN = '123456789');
Nesting of Queries (cont.)
Besides IN, other comparison operators can be used to compare a
single value v with the result of a nested query: =, >, >=, <, <=, <>
These operators can be combined with the keyword ALL
(v > ALL V) returns TRUE if the value v is greater
than all the values in the set (or multiset) V.
Query: Return the names of employees whose salary
is greater than salary of all the employees in
department 5.
SELECT LNAME, FNAME
FROM EMPLOYEE
WHERE SALARY > ALL (SELECT SALARY FROM
EMPLOYEE WHERE DNO=5)
Correlated Nested Queries
If a condition in the WHERE-clause of a nested query references an
attribute of a relation declared in the outer query , the two queries are
said to be correlated
The result of a correlated nested query can be different for each tuple (or
combination of tuples) of the relation(s) in the outer query
Query 16: Retrieve the name of each employee who has a dependent
with the same first name and same sex as the employee.
Q16: SELECT E.FNAME, E.LNAME
FROM EMPLOYEE AS E
WHERE SSN IN (SELECT ESSN
FROM DEPENDENT
WHERE FNAME=DEPENDENT_NAME
AND E.SEX = SEX)
Note: E.SEX refers to the SEX attribute of EMPLOYEE in the outer query
Nesting of Queries (cont.)
A query written with nested SELECT... FROM... WHERE... blocks and
using the = or IN comparison operators can always be expressed as a
single block query. For example, Q16 may be written as in Q16A
Q16A: SELECT E.FNAME, E.LNAME
FROM EMPLOYEE E, DEPENDENT D
WHERE E.SSN = D.ESSN AND
E.FNAME = D.DEPENDENT_NAME
AND E.SEX = D.SEX
The original SQL as specified for SYSTEM R also had a CONTAINS
comparison operator, which is used in conjunction with nested correlated
queries
This operator was dropped from the language, possibly because of the
difficulty in implementing it efficiently
Nesting of Queries (cont.)
Most implementations of SQL do not have this operator
The CONTAINS operator compares two sets of values , and returns
TRUE if one set contains all values in the other set
(reminiscent of the division operation of algebra).
Query 3: Retrieve the name of each employee who works on all the
projects controlled by department number 5.
Q3: SELECT FNAME, LNAME
FROM EMPLOYEE
WHERE ( (SELECT PNO
FROM WORKS_ON
WHERE SSN=ESSN)
CONTAINS
(SELECT PNUMBER
FROM PROJECT
WHERE DNUM=5) )
Nesting of Queries (cont.)
In Q3, the second nested query, which is not
correlated with the outer query, retrieves the
project numbers of all projects controlled by
department 5
The first nested query, which is correlated,
retrieves the project numbers on which the
employee works, which is different for each
employee tuple because of the correlation
The EXISTS Function
EXISTS is used to check whether the result of
a correlated nested query is empty (contains
no tuples) or not
We can formulate Query 16 in an alternative
form that uses EXISTS, shown as Q16B
EXISTS and NOT EXISTS are usually used
in conjunction with a correlated nested query
The EXISTS Function
Query 16: Retrieve the name of each employee who has
a dependent with the same first name and same sex as
the employee.
Q16B: SELECT FNAME, LNAME
FROM EMPLOYEE E
WHERE EXISTS
(SELECT *
FROM DEPENDENT
WHERE SSN=ESSN AND
FNAME=DEPENDENT_NAME
AND E.SEX = SEX
)
The EXISTS Function
Query 6: Retrieve the names of employees who have no
dependents.
Q6: SELECT FNAME, LNAME
FROM EMPLOYEE
WHERE NOT EXISTS (SELECT *
FROM DEPENDENT
WHERE SSN=ESSN)
In Q6, the correlated nested query retrieves all DEPENDENT tuples
related to an EMPLOYEE tuple. If none exist, the EMPLOYEE tuple is
selected
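EXISTS and NOT EXISTS can both be exercised in one small sketch, assuming Python's built-in sqlite3 module. The EMPLOYEE and DEPENDENT tables are hypothetical and cut down to the attributes the two queries need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified tables with made-up rows.
conn.executescript("""
    CREATE TABLE employee (ssn TEXT, fname TEXT, sex TEXT);
    CREATE TABLE dependent (essn TEXT, dependent_name TEXT, sex TEXT);
    INSERT INTO employee VALUES ('1', 'John', 'M'), ('2', 'Alicia', 'F'),
                                ('3', 'Franklin', 'M');
    INSERT INTO dependent VALUES ('1', 'John', 'M'), ('3', 'Joy', 'F');
""")

# EXISTS: the nested query is re-evaluated for each employee tuple e
# and the tuple is kept if at least one dependent row comes back.
same_name_and_sex = [r[0] for r in conn.execute("""
    SELECT fname FROM employee e
    WHERE EXISTS (SELECT * FROM dependent d
                  WHERE e.ssn = d.essn
                    AND e.fname = d.dependent_name
                    AND e.sex = d.sex)
    ORDER BY fname""")]

# NOT EXISTS: the tuple is kept only if the nested query returns nothing.
no_dependents = [r[0] for r in conn.execute("""
    SELECT fname FROM employee e
    WHERE NOT EXISTS (SELECT * FROM dependent d WHERE e.ssn = d.essn)
    ORDER BY fname""")]
print(same_name_and_sex, no_dependents)
```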
The EXISTS Function
Query 7: List the names of managers who have at least one dependent.
SELECT FNAME, LNAME
FROM EMPLOYEE
WHERE
EXISTS (SELECT * FROM DEPENDENT WHERE SSN=ESSN)
AND
EXISTS (SELECT * FROM DEPARTMENT WHERE SSN = MGRSSN);
The first nested query selects all DEPENDENT tuples related to an EMPLOYEE
The second nested query selects all DEPARTMENT tuples managed by the
EMPLOYEE
If at least one of the first and at least one of the second exist, we select the
EMPLOYEE tuple.
Can you rewrite this query using only one nested query, or no
nested query at all?
The EXISTS Function
Query 3: Retrieve the name of each employee who works on all the
projects controlled by department number 5
We can use the fact that (S1 CONTAINS S2) is logically equivalent to
(S2 EXCEPT S1) being empty.
SELECT FNAME, LNAME
FROM EMPLOYEE
WHERE NOT EXISTS
( (SELECT PNUMBER FROM PROJECT WHERE DNUM=5)
EXCEPT
(SELECT PNO FROM WORKS_ON WHERE SSN = ESSN));
The first subquery selects all projects controlled by department 5
The second subquery selects all projects that the particular employee being
considered works on.
If the set difference of the first subquery minus (EXCEPT) the second
subquery is empty, the employee works on all the projects and is
hence selected
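The NOT EXISTS ... EXCEPT formulation can be tried with a short sketch, assuming Python's built-in sqlite3 module. The tables, rows, and the helper `works_on_all_projects_of` are hypothetical. SQLite does not accept parentheses around the two operands of EXCEPT, so they are omitted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified tables with made-up rows.
conn.executescript("""
    CREATE TABLE employee (ssn TEXT, lname TEXT);
    CREATE TABLE project (pnumber INTEGER, dnum INTEGER);
    CREATE TABLE works_on (essn TEXT, pno INTEGER);
    INSERT INTO employee VALUES ('1', 'Smith'), ('2', 'Wong'), ('3', 'Borg');
    INSERT INTO project VALUES (1, 5), (2, 5), (10, 4);
    INSERT INTO works_on VALUES ('1', 1), ('1', 2), ('2', 1), ('2', 10);
""")

def works_on_all_projects_of(dnum):
    """Employees for whom (projects of dnum) EXCEPT (their projects) is empty."""
    sql = """
        SELECT lname FROM employee e
        WHERE NOT EXISTS (
            SELECT pnumber FROM project WHERE dnum = ?
            EXCEPT
            SELECT pno FROM works_on WHERE essn = e.ssn)
        ORDER BY lname"""
    return [row[0] for row in conn.execute(sql, (dnum,))]

dept5 = works_on_all_projects_of(5)    # Smith works on both projects 1 and 2
dept4 = works_on_all_projects_of(4)    # Wong works on project 10
dept99 = works_on_all_projects_of(99)  # no such projects: every employee qualifies
print(dept5, dept4, dept99)
```

The last call shows a consequence of this formulation: when the department controls no projects, the set difference is empty for every employee, so all employees are selected.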
Explicit Sets
It is also possible to use an explicit (enumerated)
set of values in the WHERE-clause rather than a
nested query
Query 17: Retrieve the social security numbers of all
employees who work on project number 1, 2, or 3.
Q17: SELECT DISTINCT ESSN
FROM WORKS_ON
WHERE PNO IN (1, 2, 3)
Renaming Attribute
In SQL, it is possible to rename any attribute that
appears in the result of a query by adding the
qualifier AS followed by the desired new
name.
Q8A: SELECT E.LNAME AS EMPLOYEE_NAME,
S.LNAME AS SUPERVISOR_NAME
FROM EMPLOYEE E, EMPLOYEE S
WHERE E.SUPERSSN = S.SSN;
Joined Relation Feature in SQL-99
Can specify a "joined relation" in the FROM-clause
Looks like any other relation but is the result of a join
Allows the user to specify different types of joins
(regular "theta" JOIN, NATURAL JOIN, LEFT OUTER
JOIN, RIGHT OUTER JOIN, CROSS JOIN, etc.)
Example: CROSS JOIN
SELECT * FROM Foods CROSS JOIN Likes
Foods:
name    cafe
Food 1  XYZ
Food 2  ABC
Food 3  ABC
Likes:
Person   Food
Narpati  Food 1
Nizar    Food 1
Danu     Food 3
Result:
name    cafe  Person   Food
Food 1  XYZ   Narpati  Food 1
Food 1  XYZ   Nizar    Food 1
Food 1  XYZ   Danu     Food 3
Food 2  ABC   Narpati  Food 1
Food 2  ABC   Nizar    Food 1
Food 2  ABC   Danu     Food 3
Food 3  ABC   Narpati  Food 1
Food 3  ABC   Nizar    Food 1
Food 3  ABC   Danu     Food 3
Example: NATURAL JOIN
SELECT * FROM Frequents NATURAL JOIN Likes
Likes:
Person   Food
Narpati  Food 1
Nizar    Food 1
Danu     Food 3
Harith   Food 2
Frequents:
Person  cafe
Avi     ABC
Danu    XYZ
Nizar   ABC
Jack    Zanz
Result:
Person  Food    cafe
Nizar   Food 1  ABC
Danu    Food 3  XYZ
Example: THETA JOIN
SELECT * FROM Foods B JOIN Likes L ON B.name = L.Food
Foods:
name    cafe
Food 1  XYZ
Food 2  ABC
Food 3  ABC
Likes:
Person   Food
Narpati  Food 1
Nizar    Food 1
Danu     Food 3
Result:
name    cafe  Person   Food
Food 1  XYZ   Narpati  Food 1
Food 1  XYZ   Nizar    Food 1
Food 3  ABC   Danu     Food 3
Example: OUTER JOIN
Foods:
name    cafe
Food 1  XYZ
Food 2  ABC
Food 3  ABC
Likes:
Person   Food
Narpati  Food 1
Nizar    Food 1
Danu     Food 3
Avi      Food 5
SELECT * FROM Foods B LEFT OUTER JOIN Likes L ON B.name = L.Food
Result:
name    cafe  Person   Food
Food 1  XYZ   Narpati  Food 1
Food 1  XYZ   Nizar    Food 1
Food 2  ABC   NULL     NULL
Food 3  ABC   Danu     Food 3
SELECT * FROM Foods B RIGHT OUTER JOIN Likes L ON B.name = L.Food
Result:
name    cafe  Person   Food
Food 1  XYZ   Narpati  Food 1
Food 1  XYZ   Nizar    Food 1
Food 3  ABC   Danu     Food 3
NULL    NULL  Avi      Food 5
Example: FULL OUTER JOIN
SELECT * FROM Foods B FULL OUTER JOIN Likes L ON B.name = L.Food
Foods:
name    cafe
Food 1  XYZ
Food 2  ABC
Food 3  ABC
Likes:
Person   Food
Narpati  Food 1
Nizar    Food 1
Danu     Food 3
Avi      Food 5
Result:
name    cafe  Person   Food
Food 1  XYZ   Narpati  Food 1
Food 1  XYZ   Nizar    Food 1
Food 2  ABC   NULL     NULL
Food 3  ABC   Danu     Food 3
NULL    NULL  Avi      Food 5
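The join examples can be reproduced with a minimal sketch, assuming Python's built-in sqlite3 module and the Foods and Likes rows from the outer join slides. Older SQLite versions lack RIGHT and FULL OUTER JOIN, so the right outer join is emulated here, as an assumption of this sketch, by swapping the operands of a LEFT OUTER JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Foods (name TEXT, cafe TEXT);
    CREATE TABLE Likes (person TEXT, food TEXT);
    INSERT INTO Foods VALUES ('Food 1', 'XYZ'), ('Food 2', 'ABC'), ('Food 3', 'ABC');
    INSERT INTO Likes VALUES ('Narpati', 'Food 1'), ('Nizar', 'Food 1'),
                             ('Danu', 'Food 3'), ('Avi', 'Food 5');
""")

# CROSS JOIN pairs every Foods row with every Likes row (3 x 4 rows).
cross_rows = conn.execute("SELECT * FROM Foods CROSS JOIN Likes").fetchall()

# Theta join: only the pairs that satisfy the ON condition.
theta_rows = conn.execute(
    "SELECT * FROM Foods B JOIN Likes L ON B.name = L.food "
    "ORDER BY L.person").fetchall()

# LEFT OUTER JOIN keeps unmatched Foods rows, padded with NULL (None).
left_rows = conn.execute(
    "SELECT * FROM Foods B LEFT OUTER JOIN Likes L ON B.name = L.food "
    "ORDER BY B.name, L.person").fetchall()

# Emulated RIGHT OUTER JOIN: operands swapped, columns listed in slide order.
right_rows = conn.execute(
    "SELECT B.name, B.cafe, L.person, L.food "
    "FROM Likes L LEFT OUTER JOIN Foods B ON B.name = L.food "
    "ORDER BY L.person").fetchall()

print(len(cross_rows), theta_rows, left_rows, right_rows, sep="\n")
```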
Aggregate Function
Include COUNT, SUM, MAX, MIN, and AVG
Query : Find the maximum salary, the minimum salary, and the
average salary among all employees.
SELECT MAX(SALARY),
MIN(SALARY), AVG(SALARY)
FROM EMPLOYEE
Some SQL implementations may not allow more than one
function in the SELECT-clause
Aggregate Function
Query : Find the maximum salary, the minimum
salary, and the average salary among employees
who work for the 'Research' department.
SELECT MAX(SALARY), MIN(SALARY),
AVG(SALARY)
FROM EMPLOYEE, DEPARTMENT
WHERE DNO=DNUMBER AND
DNAME='Research'
Aggregate Function
Queries : Retrieve the total number of employees in the
company (QA), and the number of employees in the 'Research'
department (QB).
QA: SELECT COUNT (*)
FROM EMPLOYEE
QB: SELECT COUNT (*)
FROM EMPLOYEE, DEPARTMENT
WHERE DNO=DNUMBER AND
DNAME='Research'
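The aggregate queries above can be run with a short sketch, assuming Python's built-in sqlite3 module and hypothetical, cut-down EMPLOYEE and DEPARTMENT tables with made-up salaries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified tables with made-up rows.
conn.executescript("""
    CREATE TABLE department (dname TEXT, dnumber INTEGER);
    CREATE TABLE employee (lname TEXT, salary INTEGER, dno INTEGER);
    INSERT INTO department VALUES ('Research', 5), ('Administration', 4),
                                  ('Headquarters', 1);
    INSERT INTO employee VALUES ('Smith', 30000, 5), ('Wong', 40000, 5),
                                ('Zelaya', 25000, 4), ('Borg', 55000, 1);
""")

# Aggregates over all employees collapse the table into a single row.
company_stats = conn.execute(
    "SELECT MAX(salary), MIN(salary), AVG(salary) FROM employee").fetchone()

# With a join and a selection, only the 'Research' employees are aggregated.
research_stats = conn.execute("""
    SELECT MAX(salary), MIN(salary), AVG(salary), COUNT(*)
    FROM employee, department
    WHERE dno = dnumber AND dname = 'Research'""").fetchone()

# COUNT(*) counts rows, here every employee in the company.
total_employees = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(company_stats, research_stats, total_employees)
```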
Grouping
In many cases, we want to apply the aggregate
functions to subgroups of tuples in a relation
Each subgroup of tuples consists of the set of tuples
that have the same value for the grouping
attribute(s)
The function is applied to each subgroup
independently
SQL has a GROUP BY-clause for specifying the
grouping attributes, which should also appear in the
SELECT-clause so that each result row can be matched to its group
Grouping (cont)
Query 24: For each department, retrieve the department number, the
number of employees in the department, and their average salary.
Q24:SELECT DNO, COUNT (*), AVG (SALARY)
FROM EMPLOYEE
GROUP BY DNO
In Q24, the EMPLOYEE tuples are divided into groups--each group
having the same value for the grouping attribute DNO
The COUNT and AVG functions are applied to each such group of
tuples separately
The SELECT-clause includes only the grouping attribute and the
functions to be applied on each group of tuples
A join condition can be used in conjunction with grouping
Grouping
Query 25: For each project, retrieve the project number, project
name, and the number of employees who work on that project.
Q25:SELECT PNUMBER, PNAME, COUNT (*)
FROM PROJECT, WORKS_ON
WHERE PNUMBER=PNO
GROUP BY PNUMBER, PNAME
In this case, the grouping and functions are applied after the joining of
the two relations
The HAVING Clause
Sometimes we want to retrieve the
values of these functions for only those
groups that satisfy certain conditions
The HAVING-clause is used for
specifying a selection condition on
groups (rather than on individual tuples)
The HAVING Clause
Query 26: For each project on which more than two
employees work , retrieve the project number, project
name, and the number of employees who work on that
project.
Q26: SELECT PNUMBER, PNAME, COUNT (*)
FROM PROJECT, WORKS_ON
WHERE PNUMBER=PNO
GROUP BY PNUMBER, PNAME
HAVING COUNT (*) > 2
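Q25 and Q26 can be compared side by side in a minimal sketch, assuming Python's built-in sqlite3 module. The PROJECT and WORKS_ON rows are hypothetical; an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified tables with made-up rows.
conn.executescript("""
    CREATE TABLE project (pnumber INTEGER, pname TEXT);
    CREATE TABLE works_on (essn TEXT, pno INTEGER);
    INSERT INTO project VALUES (1, 'ProductX'), (2, 'ProductY'), (3, 'ProductZ');
    INSERT INTO works_on VALUES ('a', 1), ('b', 1), ('c', 1),
                                ('a', 2), ('b', 2), ('c', 3);
""")

# Q25 style: join first, then one COUNT(*) per (pnumber, pname) group.
per_project = conn.execute("""
    SELECT pnumber, pname, COUNT(*)
    FROM project, works_on
    WHERE pnumber = pno
    GROUP BY pnumber, pname
    ORDER BY pnumber""").fetchall()

# Q26 style: HAVING keeps only the groups with more than two members.
busy_projects = conn.execute("""
    SELECT pnumber, pname, COUNT(*)
    FROM project, works_on
    WHERE pnumber = pno
    GROUP BY pnumber, pname
    HAVING COUNT(*) > 2
    ORDER BY pnumber""").fetchall()
print(per_project, busy_projects)
```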
Summary of SQL Queries
A query in SQL can consist of up to six clauses, but
only the first two, SELECT and FROM, are
mandatory. The clauses are specified in the following
order:
SELECT <attribute list>
FROM <table list>
[WHERE <condition>]
[GROUP BY <grouping attribute(s)>]
[HAVING <group condition>]
[ORDER BY <attribute list>]
Summary of SQL Queries (cont.)
The SELECT-clause lists the attributes or functions to be
retrieved
The FROM-clause specifies all relations (or aliases) needed in
the query but not those needed in nested queries
The WHERE-clause specifies the conditions for selection and
join of tuples from the relations specified in the FROM-clause
GROUP BY specifies grouping attributes
HAVING specifies a condition for selection of groups
ORDER BY specifies an order for displaying the result of a
query
A query is conceptually evaluated by first applying the FROM-clause and
WHERE-clause, then GROUP BY and HAVING, then the SELECT-clause, and finally ORDER BY
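The difference between filtering tuples with WHERE and filtering groups with HAVING can be seen in a short sketch, assuming Python's built-in sqlite3 module and a hypothetical two-column EMPLOYEE table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified EMPLOYEE table with made-up rows.
conn.execute("CREATE TABLE employee (dno INTEGER, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?)", [
    (5, 30000), (5, 40000), (5, 38000), (5, 25000),
    (4, 43000), (4, 25000), (4, 25000),
])

# WHERE removes individual tuples before the groups are formed, so
# COUNT(*) only sees employees earning more than 30000.
where_then_having = conn.execute("""
    SELECT dno, COUNT(*)
    FROM employee
    WHERE salary > 30000
    GROUP BY dno
    HAVING COUNT(*) > 1
    ORDER BY dno""").fetchall()

# Without the WHERE-clause every tuple is grouped, so both departments
# pass the same HAVING condition.
having_only = conn.execute("""
    SELECT dno, COUNT(*)
    FROM employee
    GROUP BY dno
    HAVING COUNT(*) > 1
    ORDER BY dno""").fetchall()
print(where_then_having, having_only)
```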