SQL
Chapter 5
SQL:1999
http://www.cs.ubc.ca/~ramesh/cpsc304
De Facto Standard
SQL:1999 - Current standard
Core SQL : Collection of features of SQL:1999 - A vendor MUST
implement core SQL features to claim compliance with the SQL:1999
standard.
Also called "SEQUEL"
Other Aspects:
• Client-server execution and Remote DB access, Transaction Management
• Security, Advanced Features (OO, Recursive queries, XML, ...)
This strategy may not be the most efficient way to compute the
query. An optimizer will find more efficient strategies to compute the
same answers
Example
• Instance S3 of Sailors: Instance R2 of Reserves:
Query Q1:
sname    age     sname    age
Dustin   45.0    Dustin   45.0
Brutus   33.0    Brutus   33.0
Lubber   55.5    Lubber   55.5
Andy     25.5    Andy     25.5
Rusty    35.0    Rusty    35.0
Horatio  35.0    Horatio  35.0
Zorba    16.0    Zorba    16.0
Horatio  35.0    Art      25.5
Art      25.5    Bob      63.5
Bob      63.5
SELECT R.sid
FROM Boats B, Reserves R
WHERE B.bid = R.bid AND B.color = 'Red'
SELECT S.sname
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'Red'
More Examples
SELECT B.color
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid AND S.sname = 'Lubber'
• In general, there may be more than one sailor called 'Lubber' (since sname is not a key)
• Find the names of Sailors who have reserved at least one boat
SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid = R.sid
• The join of Sailors and Reserves ensures that, for each sname returned, the sailor has made
some reservation.
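The join queries above can be tried out with a small sketch using Python's built-in sqlite3 module. The table layouts (sid, sname, rating, age; bid, bname, color; sid, bid, day) and all sample rows are assumptions chosen for illustration; they are not the instances from the slides.

```python
import sqlite3

# In-memory database with a tiny, hypothetical Sailors/Boats/Reserves instance.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
CREATE TABLE Boats (bid INTEGER PRIMARY KEY, bname TEXT, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
INSERT INTO Sailors VALUES (22, 'Dustin', 7, 45.0), (31, 'Lubber', 8, 55.5),
                           (58, 'Rusty', 10, 35.0);
INSERT INTO Boats VALUES (101, 'Interlake', 'Blue'), (102, 'Interlake', 'Red'),
                         (103, 'Clipper', 'Green');
INSERT INTO Reserves VALUES (22, 101, '1998-10-10'), (22, 102, '1998-10-10'),
                            (31, 102, '1998-11-10'), (31, 103, '1998-11-06');
""")

# Names of sailors who have reserved a red boat.
red_boat_sailors = sorted(row[0] for row in conn.execute("""
    SELECT S.sname
    FROM Sailors S, Reserves R, Boats B
    WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'Red'
"""))

# Colors of boats reserved by Lubber.
lubber_colors = sorted(row[0] for row in conn.execute("""
    SELECT B.color
    FROM Sailors S, Reserves R, Boats B
    WHERE S.sid = R.sid AND R.bid = B.bid AND S.sname = 'Lubber'
"""))

# Names of sailors with at least one reservation: without DISTINCT the
# answer is a multiset, one row per matching (sailor, reservation) pair.
names_multiset = [row[0] for row in conn.execute("""
    SELECT S.sname
    FROM Sailors S, Reserves R
    WHERE S.sid = R.sid
""")]

print(red_boat_sailors, lubber_colors, names_multiset)
```

Rusty has no reservation in this sample, so the join drops him; Dustin and Lubber each appear twice in the last answer because each has two reservations.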
EXAMPLES
• Compute increments for the ratings of persons who have sailed two different boats on the
same day
• Comparison operators like =, <, > etc. can be used for string comparison, ordering is
alphabetical.
String Operations
• For sorting strings other than alphabetical order (e.g. Month names in calendar order),
SQL supports the general concept of collation or sort order for a character set.
• Allows users to specify which characters are less than which others
• SQL also supports pattern matching through the LIKE operator and wild card symbols
. symbol % - stands for zero or more arbitrary characters
. symbol _ - stands for exactly one arbitrary character
. Blanks can be significant for the LIKE operator
• Find the ages of sailors whose name begins and ends with B and has at least three
characters
SELECT S.age
FROM Sailors S
WHERE S.sname LIKE 'B_%B'
• SQL:1999 adds a more powerful version of LIKE called SIMILAR, which allows regular
expressions in text search
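A minimal sketch of the LIKE query, again using Python's sqlite3 module. The sample names are hypothetical. Note one assumption about the engine: SQLite's LIKE is case-insensitive for ASCII letters by default, so the trailing 'b' in 'Bob' matches the 'B' in the pattern; other systems may compare case-sensitively.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
-- Hypothetical rows: 'BB' is too short, 'Brutus' does not end with B.
INSERT INTO Sailors VALUES (1, 'Bob', 3, 63.5), (2, 'Brutus', 1, 33.0),
                           (3, 'BB', 5, 20.0), (4, 'Barb', 6, 41.0);
""")

# 'B_%B': a B, exactly one character (_), zero or more characters (%), a B.
ages = [row[0] for row in conn.execute("""
    SELECT S.age
    FROM Sailors S
    WHERE S.sname LIKE 'B_%B'
    ORDER BY S.age
""")]

print(ages)
```

The underscore forces at least one character between the two Bs, which is what guarantees a minimum length of three.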
RA and SQL
Set operations of SQL are available in RA except that they are multiset
operations in SQL.
EXAMPLES
• Find names of sailors who have reserved a red boat or a green boat
SELECT S.sname
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid
AND (B.color = 'Red' OR B.color = 'Green')
• Find the names of sailors who have reserved both a red boat and a green boat: Replacing
OR in the previous query with AND will try to retrieve boats that are both red and green
in color and will always return an empty answer.
SELECT S.sname
FROM Sailors S, Reserves R1, Boats B1, Reserves R2, Boats B2
WHERE S.sid = R1.sid AND R1.bid = B1.bid AND
S.sid = R2.sid AND R2.bid = B2.bid AND
B1.color = 'Red' AND B2.color = 'Green'
• Both the queries above can be expressed using UNION and INTERSECT
SELECT S.sname
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'Red'
UNION
SELECT S2.sname
FROM Sailors S2, Reserves R2, Boats B2
WHERE S2.sid = R2.sid AND R2.bid = B2.bid AND B2.color = 'Green'
SELECT S.sname
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'Red'
INTERSECT
SELECT S2.sname
FROM Sailors S2, Reserves R2, Boats B2
WHERE S2.sid = R2.sid AND R2.bid = B2.bid AND B2.color = 'Green'
• If two sailors have the name hook, one has reserved a red boat and the other has reserved
a green boat, then this name will be returned even though the same hook has not reserved
both a red and a green boat.
• The cause of this bug is that sname is not a key for Sailors, yet we are using it to
identify sailors.
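The INTERSECT pitfall can be reproduced with a short sqlite3 sketch. The two sailors named hook and the two boats are hypothetical sample rows. Intersecting on sname returns hook, while intersecting on the key sid correctly returns nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT);
CREATE TABLE Boats (bid INTEGER PRIMARY KEY, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
-- Two different sailors share the name 'hook'.
INSERT INTO Sailors VALUES (1, 'hook'), (2, 'hook');
INSERT INTO Boats VALUES (101, 'Red'), (102, 'Green');
-- Sailor 1 reserved only the red boat, sailor 2 only the green one.
INSERT INTO Reserves VALUES (1, 101), (2, 102);
""")

TEMPLATE = """
    SELECT S.{col}
    FROM Sailors S, Reserves R, Boats B
    WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'Red'
    INTERSECT
    SELECT S2.{col}
    FROM Sailors S2, Reserves R2, Boats B2
    WHERE S2.sid = R2.sid AND R2.bid = B2.bid AND B2.color = 'Green'
"""

# Intersecting on the non-key column sname: the bug shows up.
by_name = [row[0] for row in conn.execute(TEMPLATE.format(col="sname"))]

# Intersecting on the key sid: no single sailor reserved both colors.
by_sid = [row[0] for row in conn.execute(TEMPLATE.format(col="sid"))]

print(by_name, by_sid)
```

One way to get names correctly is to intersect on sid first and then look up sname for the surviving sids.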
Examples - Difference
• Find the sids of all sailors who have reserved red boats but not green boats
SELECT S.sid
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'Red'
EXCEPT
SELECT S2.sid
FROM Sailors S2, Reserves R2, Boats B2
WHERE S2.sid = R2.sid AND R2.bid = B2.bid AND B2.color = 'Green'
SELECT R.sid
FROM Reserves R, Boats B
WHERE R.bid = B.bid AND B.color = 'Red'
EXCEPT
SELECT R2.sid
FROM Reserves R2, Boats B2
WHERE R2.bid = B2.bid AND B2.color = 'Green'
• The query assumes referential integrity - There are no reservations for nonexisting sailors
Examples
• Find all sids of sailors who have a rating of 10 or reserved boat 104
SELECT S.sid
FROM Sailors S
WHERE S.rating = 10
UNION
SELECT R.sid
FROM Reserves R
WHERE R.bid = 104
NESTED QUERIES
• The embedded query is called a subquery (which can be a nested query too)
• Nesting of queries is not available in Relational Algebra but nested queries can be
translated to Algebra queries
Example
SELECT S.sname
FROM Sailors S
WHERE S.sid IN (SELECT R.sid
FROM Reserves R
WHERE R.bid = 103)
SELECT S.sname
FROM Sailors S
WHERE S.sid IN (SELECT R.sid
                FROM Reserves R
                WHERE R.bid IN (SELECT B.bid
                                FROM Boats B
                                WHERE B.color = 'Red'))
Example
• Find the names of sailors who have NOT reserved a red boat
SELECT S.sname
FROM Sailors S
WHERE S.sid NOT IN (SELECT R.sid
                    FROM Reserves R
                    WHERE R.bid IN (SELECT B.bid
                                    FROM Boats B
                                    WHERE B.color = 'Red'))
• Note: In all our nested queries thus far, the inner subquery has been completely
independent of the outer query
• In general, the inner subquery could depend on the row being examined in the outer
query ⇒ Correlated Nested Queries
• Find the names of sailors who have reserved boat number 103
SELECT S.sname
FROM Sailors S
WHERE EXISTS ( SELECT *
FROM Reserves R
WHERE R.bid = 103
AND R.sid = S.sid )
• EXISTS - another set-comparison operator, like IN. Allows us to test whether a set is
nonempty (an implicit comparison with the empty set)
• Occurrence of S in the subquery in the form of S.sid is called a correlation and this is an
example of a correlated nested query
• Query also illustrates the use of ∗ in situations where existence of rows is tested rather
than retrieving some columns in a row (Other use is in aggregation)
• Using NOT EXISTS, we can compute the names of sailors who have not reserved a red
boat
• Applying UNIQUE to a subquery, the result is true if no row appears twice in the answer
to the subquery (If there are no duplicates)
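The NOT EXISTS formulation mentioned above (names of sailors who have not reserved a red boat) might look like the following sqlite3 sketch. The sample rows are hypothetical, and the correlated subquery here joins Reserves and Boats directly, which is one possible way to write it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT);
CREATE TABLE Boats (bid INTEGER PRIMARY KEY, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
INSERT INTO Sailors VALUES (22, 'Dustin'), (31, 'Lubber'), (58, 'Rusty');
INSERT INTO Boats VALUES (102, 'Red'), (103, 'Green');
-- Dustin reserved a red boat, Lubber a green one, Rusty nothing.
INSERT INTO Reserves VALUES (22, 102), (31, 103);
""")

# The subquery refers to S.sid from the outer query (the correlation),
# so it is conceptually re-evaluated for every Sailors row.
no_red = [row[0] for row in conn.execute("""
    SELECT S.sname
    FROM Sailors S
    WHERE NOT EXISTS (SELECT *
                      FROM Reserves R, Boats B
                      WHERE R.sid = S.sid
                        AND R.bid = B.bid
                        AND B.color = 'Red')
    ORDER BY S.sname
""")]

print(no_red)
```

Rusty qualifies even though he has no reservations at all: the subquery is empty for him, so NOT EXISTS is true.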
Set-Comparison Operators
• EXISTS, IN, UNIQUE and their negated versions
• op ANY and op ALL are two more operators where op is one of the arithmetic comparison
operators {<, >, =, ≤, ≥, <>}
• Find sailors whose rating is better than some sailor called Horatio (use ALL in place of
ANY for "every sailor called Horatio")
SELECT S.sid
FROM Sailors S
WHERE S.rating > ANY (SELECT S2.rating
                      FROM Sailors S2
                      WHERE S2.sname = 'Horatio')
• Find the sailors with the highest rating
SELECT S.sid
FROM Sailors S
WHERE S.rating >= ALL (SELECT S2.rating
                       FROM Sailors S2)
• Find the names of sailors who have reserved a red boat and a green boat
SELECT S.sname
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'Red'
AND S.sid IN (SELECT S2.sid
              FROM Sailors S2, Reserves R2, Boats B2
              WHERE S2.sid = R2.sid AND R2.bid = B2.bid
              AND B2.color = 'Green')
Example of DIVISION
• Find the names of sailors who have reserved all boats
SELECT S.sname
FROM Sailors S
WHERE NOT EXISTS (( SELECT B.bid
FROM Boats B )
EXCEPT
(SELECT R.bid
FROM Reserves R
WHERE R.sid = S.sid ))
• The query is correlated. Without using EXCEPT, the query can be written as
SELECT S.sname
FROM Sailors S
WHERE NOT EXISTS (SELECT B.bid
                  FROM Boats B
                  WHERE NOT EXISTS (SELECT R.bid
                                    FROM Reserves R
                                    WHERE R.bid = B.bid
                                    AND R.sid = S.sid))
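The division query without EXCEPT can be checked on a small hypothetical instance using sqlite3. Read the double negation as: there is no boat for which there is no reservation by this sailor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT);
CREATE TABLE Boats (bid INTEGER PRIMARY KEY, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
INSERT INTO Sailors VALUES (22, 'Dustin'), (31, 'Lubber');
INSERT INTO Boats VALUES (101, 'Blue'), (102, 'Red');
-- Dustin reserved every boat; Lubber is missing boat 102.
INSERT INTO Reserves VALUES (22, 101), (22, 102), (31, 101);
""")

reserved_all = [row[0] for row in conn.execute("""
    SELECT S.sname
    FROM Sailors S
    WHERE NOT EXISTS (SELECT B.bid
                      FROM Boats B
                      WHERE NOT EXISTS (SELECT R.bid
                                        FROM Reserves R
                                        WHERE R.bid = B.bid
                                          AND R.sid = S.sid))
""")]

print(reserved_all)
```

If Boats were empty, every sailor would qualify, since the outer NOT EXISTS would be trivially true.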
AGGREGATE OPERATORS
• SQL supports five aggregate operators that can be applied on any column say A of a
relation
. COUNT ([DISTINCT ]A) - The number of (unique) values in column A
. SUM([DISTINCT ]A) - The sum of all (unique) values in column A
. AVG([DISTINCT ]A) - The Average of all (unique) values in column A
. MAX(A) - The maximum value in the A column
. MIN(A) - The minimum value in the A column
EXAMPLES
• Find the average age of all sailors
• Find the name and age of the oldest sailor - An illegal query is given below - The intent
is to return the name associated with the sailor with the maximum age
• If a SELECT clause uses aggregate operations, then it must use ONLY aggregate
operations unless there is GROUP BY - This means that we cannot use S.sname and
MAX(S.age) in the SELECT clause.
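One standard way around this restriction is to compute the maximum age in a nested subquery and compare each sailor's age against it. The sketch below uses sqlite3 with three hypothetical rows. It deliberately does not run the illegal form, because SQLite happens to be lenient and accepts bare columns next to aggregates, which would hide the problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
INSERT INTO Sailors VALUES (22, 'Dustin', 7, 45.0), (31, 'Lubber', 8, 55.5),
                           (95, 'Bob', 3, 63.5);
""")

# Average age of all sailors: a SELECT clause with only an aggregate.
avg_age = conn.execute("SELECT AVG(S.age) FROM Sailors S").fetchone()[0]

# Name and age of the oldest sailor: the aggregate lives in the subquery,
# so the outer SELECT clause contains no aggregate at all.
oldest = conn.execute("""
    SELECT S.sname, S.age
    FROM Sailors S
    WHERE S.age = (SELECT MAX(S2.age)
                   FROM Sailors S2)
""").fetchall()

print(avg_age, oldest)
```

If several sailors share the maximum age, this form returns all of them.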
Examples Continued
• Observation: The following query is legal in the SQL standard but is unfortunately not
supported in many systems
• Aggregate operators provide an alternative to ANY and ALL - Find the names of sailors
who are older than the oldest sailor with a rating of 10.
• Aggregate operators thus far have been used on all qualifying rows. Often these operators
need to be applied on groups of rows
• Example: Find the age of the youngest sailor for each rating level - If we know the rating
values are in the range 1 through 10, we write 10 queries of the form
• Major extension to SQL - GROUP BY and HAVING to specify qualification over groups
• Find the age of the youngest sailor for each rating level
• Find the age of the youngest sailor who is at least 18 years for each rating level with at
least two such sailors
Evaluation Example
Instance S4 of Sailors:
Evaluation Steps
Step 2: Apply qualification in the WHERE clause: (S.age > 18)
Evaluation Steps
Step 3: Eliminate unwanted columns - Retain only columns mentioned in the
SELECT, GROUP BY or HAVING clause ⇒ eliminate sid and sname
Evaluation Example
Step 5: Apply group-qualification in the HAVING clause - COUNT (∗) > 1
• If WHERE is NOT applied first, then rating 10 would have met the group-qualification
rating minage
3 25.5
7 35.0
8 25.5
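The evaluation steps can be replayed with sqlite3. The names and ages below follow the sname/age listing earlier in these notes; the sids and ratings are assumptions picked so that the answer matches the table above, since the slide's instance is not reproduced here. The second query drops the WHERE clause to show why the order of evaluation matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
-- sids and ratings are assumed for illustration.
INSERT INTO Sailors VALUES
  (22, 'Dustin', 7, 45.0), (29, 'Brutus', 1, 33.0), (31, 'Lubber', 8, 55.5),
  (32, 'Andy', 8, 25.5),   (58, 'Rusty', 10, 35.0), (64, 'Horatio', 7, 35.0),
  (71, 'Zorba', 10, 16.0), (74, 'Horatio', 9, 35.0), (85, 'Art', 3, 25.5),
  (95, 'Bob', 3, 63.5);
""")

# WHERE filters rows first, GROUP BY forms groups, HAVING filters groups.
with_where = conn.execute("""
    SELECT S.rating, MIN(S.age) AS minage
    FROM Sailors S
    WHERE S.age >= 18
    GROUP BY S.rating
    HAVING COUNT(*) > 1
    ORDER BY S.rating
""").fetchall()

# Without the WHERE clause, 16-year-old Zorba stays in group 10, so that
# group has two rows and passes the HAVING test.
without_where = conn.execute("""
    SELECT S.rating, MIN(S.age) AS minage
    FROM Sailors S
    GROUP BY S.rating
    HAVING COUNT(*) > 1
    ORDER BY S.rating
""").fetchall()

print(with_where)
print(without_where)
```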
Examples
• Example:
• In Step 5 of evaluation: Every row in a group must satisfy the condition S.age <= 60 to
meet the group-qualification. Group 3 is hence dropped.
• The answer is
rating minage
7 35.0
8 25.5
A Slight Variation
• Step 3 of the evaluation is altered now. Row with age 63.5 no longer exists.
rating minage
3 25.5
7 35.0
8 25.5
• Even though the group-qualification B.color = 'red' is single-valued per group (bid is a
key for Boats and therefore determines color), SQL disallows this query
• ONLY columns that appear in the GROUP BY clause can appear in the HAVING clause
(unless they appear as arguments to an aggregate operator in the HAVING clause).
• In SQL:1999 this query can be rewritten easily using EVERY in the HAVING clause.
More Examples
• Find the average age of sailors for each rating level that has at least two sailors.
• An alternate way of writing the above query shows that the HAVING clause can have a
nested subquery
More Examples
• Find the average age of Sailors who are at least 18 years of age for each rating level that
has at least two sailors
• Find the average age of Sailors who are at least 18 years of age for each rating level that
has at least two such sailors
More Examples
• Find the average age of Sailors who are at least 18 years of age for each rating level that
has at least two such sailors - Alternate Formulations
• WHERE clause is applied before GROUP BY - we can take advantage of this fact.
• Another formulation:
Discussion
More Examples
Aggregate operations cannot be nested
• Find those ratings for which the average age of sailors is the minimum over all ratings
SELECT S.rating
FROM Sailors S
WHERE AVG (S.age) = ( SELECT MIN (AVG (S2.age))
FROM Sailors S2
GROUP BY S2.rating )
• Will not work even if MIN(AVG(S2.age)) were allowed - within each rating group, MIN
applied to that group's single average returns the same value as the average itself.
The idea is to use temporary tables.
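A sketch of the temporary-table idea, written with a subquery in the FROM clause and run through sqlite3. The names Temp, Temp2 and avgage are hypothetical, as are the sample rows, whose ages are chosen so that two ratings tie for the minimum average.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
-- Average ages: rating 3 -> 25.0, rating 7 -> 45.0, rating 8 -> 25.0
INSERT INTO Sailors VALUES (1, 'A', 3, 20.0), (2, 'B', 3, 30.0),
                           (3, 'C', 7, 40.0), (4, 'D', 7, 50.0),
                           (5, 'E', 8, 25.0);
""")

# Step 1 (Temp): one row per rating with its average age.
# Step 2: keep the ratings whose average equals the minimum over Temp2,
# which is the same per-rating table computed again.
min_avg_ratings = [row[0] for row in conn.execute("""
    SELECT Temp.rating
    FROM (SELECT S.rating, AVG(S.age) AS avgage
          FROM Sailors S
          GROUP BY S.rating) AS Temp
    WHERE Temp.avgage = (SELECT MIN(Temp2.avgage)
                         FROM (SELECT AVG(S2.age) AS avgage
                               FROM Sailors S2
                               GROUP BY S2.rating) AS Temp2)
    ORDER BY Temp.rating
""")]

print(min_avg_ratings)
```

No aggregate is nested inside another here: MIN is applied to a column of an already computed table.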
NULL VALUES
• SQL provides a special column value called null to use in such situations.
• Consider an example: Suppose Dan is a new sailor who does not yet have a
Rating value. We can insert the tuple with a null value as ⟨98, Dan, null, 39⟩
• Consider a comparison Rating = 8. If this comparison is applied to the tuple where the
Rating value is null, is it true or false?
• How about comparing two null values using <, >, = etc.? The answer is unknown.
• SQL provides a special comparison operator IS NULL to test whether a column value is
null (likewise IS NOT NULL)
Logical Connectives
• Once we have null values, we must define logical operations AND, OR, NOT using a
three-valued logic.
• In the presence of null values, any row evaluating to false or unknown is eliminated.
• Eliminating such rows has a subtle but significant impact on nested queries involving
EXISTS or UNIQUE
• Duplicates: Two rows are duplicates if the corresponding columns are either equal or
both contain null values
• Contrast this with comparison: Two null values when compared using = is unknown (In
the case of duplicates, this comparison is implicitly treated as true which is an anomaly)
• Arithmetic operations +, −, ∗ and / all return null if one of their operands is null.
• COUNT (∗) treats null values just like other values (included in the count)
• All other aggregate operators (COUNT, SUM, MIN, MAX, AVG and their variations
using DISTINCT) discard null values
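The null-value rules above can be observed with a small sqlite3 sketch. The Dan tuple comes from the example in these notes; the other two rows are hypothetical. SQL null shows up as None in Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
INSERT INTO Sailors VALUES (98, 'Dan', NULL, 39.0),
                           (22, 'Dustin', 7, 45.0),
                           (31, 'Lubber', 8, 55.5);
""")

def names(where):
    """Return the sorted snames of rows that satisfy the given condition."""
    sql = "SELECT S.sname FROM Sailors S WHERE " + where + " ORDER BY S.sname"
    return [row[0] for row in conn.execute(sql)]

# Dan's rating is null, so both comparisons are unknown for his row and
# WHERE eliminates it in both cases.
equal_8 = names("S.rating = 8")
not_equal_8 = names("S.rating <> 8")
is_null = names("S.rating IS NULL")

# Comparing two nulls with = yields unknown (null), not true.
null_eq_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# Arithmetic on a null operand yields null.
dan_plus_one = conn.execute(
    "SELECT S.rating + 1 FROM Sailors S WHERE S.sid = 98").fetchone()[0]

# COUNT(*) counts Dan's row; COUNT(rating) and SUM(rating) skip the null.
counts = conn.execute(
    "SELECT COUNT(*), COUNT(S.rating), SUM(S.rating) FROM Sailors S").fetchone()

print(equal_8, not_equal_8, is_null, null_eq_null, dan_plus_one, counts)
```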
OUTER JOINS
• Interesting variant of the join operation that relies on null values - Supported in SQL
• In an outer join, sailor rows without a matching Reserves row appear exactly once in the
result, with the columns from the Reserves relation assigned null values
• Variants:
. LEFT OUTER JOIN: Sailor rows without matching Reserves row appear in the
result, not vice versa.
. RIGHT OUTER JOIN: Reserves rows without matching Sailor row appear in the
result, not vice versa.
. FULL OUTER JOIN: Both Sailor and Reserves rows without matching tuples,
appear in the result.
EXAMPLE
• In SQL, OUTER JOIN is specified in the FROM clause.
sid bid
22 101
31 null
58 103
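A left outer join producing the sid/bid table above can be sketched with sqlite3. The Sailors and Reserves rows are assumptions chosen to reproduce that result, and the join condition is written with an explicit ON clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
INSERT INTO Sailors VALUES (22, 'Dustin'), (31, 'Lubber'), (58, 'Rusty');
-- Sailor 31 has no reservation.
INSERT INTO Reserves VALUES (22, 101), (58, 103);
""")

# Sailors rows without a matching Reserves row still appear once,
# with the Reserves columns set to null (None in Python).
outer = conn.execute("""
    SELECT S.sid, R.bid
    FROM Sailors S LEFT OUTER JOIN Reserves R ON S.sid = R.sid
    ORDER BY S.sid
""").fetchall()

# The ordinary (inner) join drops sailor 31 entirely.
inner = conn.execute("""
    SELECT S.sid, R.bid
    FROM Sailors S, Reserves R
    WHERE S.sid = R.sid
    ORDER BY S.sid
""").fetchall()

print(outer)
print(inner)
```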
• We can disallow null values by specifying NOT NULL. There is an implicit NOT NULL
constraint for every field listed in a PRIMARY KEY constraint.
• Table Constraints over a single table using CHECK conditional-expression - To ensure that
rating must be a value in the range 1 to 10
• To enforce the constraint that "Interlake" boats cannot be reserved, we use the following.
• When a row is inserted into Reserves, or an existing row is modified, the conditional-
expression in the CHECK condition is evaluated and the command is rejected if the
expression evaluates to false.
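The rating constraint might be declared as in the sketch below, run through sqlite3. The sailor named Eve is hypothetical. The "Interlake" constraint is not shown because it needs a subquery inside CHECK, which SQLite does not accept. Note that a null rating passes the check: the expression evaluates to unknown, not false.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Sailors (
        sid    INTEGER,
        sname  TEXT NOT NULL,
        rating INTEGER,
        age    REAL,
        PRIMARY KEY (sid),
        CHECK (rating >= 1 AND rating <= 10)
    )
""")

# A rating inside the range is accepted.
conn.execute("INSERT INTO Sailors VALUES (22, 'Dustin', 7, 45.0)")

# A null rating is accepted too: the CHECK expression is unknown, not false.
conn.execute("INSERT INTO Sailors VALUES (98, 'Dan', NULL, 39.0)")

# A rating of 11 makes the CHECK expression false, so the insert is rejected.
try:
    conn.execute("INSERT INTO Sailors VALUES (99, 'Eve', 11, 30.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Sailors").fetchone()[0]
print(rejected, row_count)
```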
Domain Constraints
• INTEGER is the source type for ratingval and every ratingval must be of this type.
rating ratingval
• Comparisons between types and the underlying source types will succeed.
• However, if BoatId and SailorId both have INTEGER as source types and we want
comparisons between these types to fail, SQL:1999 has the notion of distinct types
• Defines a new distinct type called ratingtype with INTEGER as source type.
• As an example, suppose that number of boats plus the number of sailors should be less
than 100.
• If Sailors is empty, the constraint always holds even if Boats have over 100 rows.
Assertions
EXAMPLES
Certified(eid:integer, aid:integer)
• EVERY pilot is certified for some aircraft and ONLY pilots are certified to fly.
Examples of Queries
• Find the names of aircraft such that ALL pilots certified to operate them earn more than
100,000 dollars
• For each pilot who is certified for more than 5 aircraft, find the eid and maximum
cruisingrange of the aircraft for which he or she is certified.
Examples of Queries
• Print the name and salary of every nonpilot whose salary is more than the average salary
for pilots.
Examples of Queries
• Compute the difference between the average salary of a pilot and the average salary of
all employees (including pilots)
Examples of Queries
• Print enames of pilots who can operate planes with cruising range greater than 3000 miles
but who are not certified on any Boeing aircraft
DB Application Development
Parts of Chapter 6
http://www.cs.ubc.ca/~ramesh/cpsc304
Embedded SQL:
• Static SQL queries
• Can dynamically create queries at runtime and execute them - Dynamic SQL
Embedded SQL
• SQL Statements within a host language - clearly marked so that a preprocessor can deal
with them before compilation
• Host language variables used to pass arguments into an SQL command must be declared
in SQL
• Some special host language variables must be declared in SQL (for communicating error
conditions)
• Complications:
. Data types recognized by SQL and the host language may not match - mismatch
addressed by type casting
. SQL - set oriented. This is addressed by means of cursors
• SQL can refer to variables in the host program. Such variables are prefixed by a colon (:)
• SQL Standard defines correspondence between host language types and SQL types for a
number of host languages
• SQLCODE - returns a negative value when an error condition occurs without specifying
what a particular negative number denotes (type of this in C is long)
• SQLSTATE - Associates predefined values with some common error conditions,
introducing some uniformity to how errors are reported (type of this in C is char[6])
EXEC SQL
INSERT INTO Sailors VALUES (:c_sname, :c_sid, :c_rating, :c_age);
• SQLSTATE variable should be checked for errors and exceptions after each embedded
SQL statement
CURSORS
• Impedance Mismatch: SQL operates on sets of records, host languages do not support
a clean set-of-records abstraction.
• CURSOR – Mechanism that allows us to retrieve rows one at a time from a relation
• Once declared, we can open it, fetch the next row, move the cursor or close the cursor
• INSERT, DELETE and UPDATE statements typically require no cursor (some variants
use cursors)
• SELECT usually requires a cursor though if the answer has a single row, it is not needed.
More Cursors
OPEN sinfo
FETCH sinfo INTO :c_sname, :c_age;
CLOSE sinfo
More Cursors
• Scrollable cursor - Variants of the FETCH command can be used to position the cursor
in very flexible ways. Otherwise, only the basic FETCH command can be used to get the
next row
• Insensitive cursor - Cursor behaves as if it is ranging over a private copy of the collection
of answer rows. Otherwise, actions of other transactions could modify these rows, causing
some unpredictable behaviour.
• Holdable cursor - Is not closed when the transaction is committed - Motivation from long
transactions in which we access and possibly change a large number of rows of a table.
Aborting a transaction entails redoing a transaction when restarted
• ORDER BY specifies the sort order in which the FETCH command has to retrieve the
rows
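The open / fetch / close pattern can be illustrated by analogy with Python's sqlite3 module. This is not embedded SQL with a preprocessor: the DB-API cursor only mirrors the idea of retrieving one row at a time into host variables. The query, the host variable c_minrating and the sample rows are assumptions, and the ? placeholder plays the role of a colon-prefixed host variable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
INSERT INTO Sailors VALUES (22, 'Dustin', 7, 45.0), (29, 'Brutus', 1, 33.0),
                           (31, 'Lubber', 8, 55.5);
""")

c_minrating = 5  # host variable passed into the query

# "Declare and open": running the query positions the cursor before the
# first answer row. ORDER BY fixes the order in which rows are fetched.
cur = conn.cursor()
cur.execute(
    "SELECT S.sname, S.age FROM Sailors S WHERE S.rating > ? ORDER BY S.sname",
    (c_minrating,),
)

# "Fetch": pull one row at a time into host variables until none are left.
fetched = []
while True:
    row = cur.fetchone()
    if row is None:      # plays the role of the no-more-rows condition
        break
    c_sname, c_age = row
    fetched.append((c_sname, c_age))

# "Close": release the cursor.
cur.close()
print(fetched)
```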