4 sql1
Select-From-Where
Statements
Multirelation Queries
Subqueries
1
Why SQL?
SQL is a very-high-level language.
Say “what to do” rather than “how to do it.”
Avoid a lot of data-manipulation details
needed in procedural languages like C++
or Java.
Database management system figures
out “best” way to execute query.
Called “query optimization.”
2
Select-From-Where
Statements
SELECT desired attributes
FROM one or more tables
WHERE condition about tuples of
the tables
3
Our Running Example
All our SQL queries will be based on the
following database schema.
Underline indicates key attributes.
drinks(name, manf)
cafes(name, addr, license)
Drinkers(name, addr, phone)
Likes(drinker, drink)
Sells(cafe, drink, price)
Frequents(drinker, cafe)
4
Example
Using drinks(name, manf), what drinks
are made by Anheuser-Busch?
SELECT name
FROM drinks
WHERE manf = 'Anheuser-Busch';
5
Result of Query
name
coke
coke Lite
Michelob
...
The answer is a relation with a single attribute,
name, and tuples with the name of each drink
by Anheuser-Busch, such as coke.
6
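The query above can be tried out with a small sketch using Python's built-in sqlite3 module. Only the schema drinks(name, manf) comes from the slides; the sample rows (including the manufacturer "Other Co.") are made up for illustration.

```python
import sqlite3

# In-memory database; the rows below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE drinks (name TEXT PRIMARY KEY, manf TEXT)")
conn.executemany(
    "INSERT INTO drinks VALUES (?, ?)",
    [
        ("coke", "Anheuser-Busch"),
        ("Michelob", "Anheuser-Busch"),
        ("pepsi", "Other Co."),
    ],
)

# The SELECT-FROM-WHERE statement from the slide.
rows = conn.execute(
    "SELECT name FROM drinks WHERE manf = 'Anheuser-Busch'"
).fetchall()
names = [row[0] for row in rows]
print(names)
```

The answer is again a relation: a list of one-component tuples, one per matching drink.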
Meaning of Single-Relation
Query
Begin with the relation in the
FROM clause.
Apply the selection indicated by
the WHERE clause.
Apply the extended projection
indicated by the SELECT clause.
7
Operational Semantics
[Figure: the relation drinks with columns name and manf; tuple-variable t loops over all tuples.]
8
Operational Semantics ---
General
Think of a tuple variable visiting
each tuple of the relation
mentioned in FROM.
Check if the “current” tuple satisfies
the WHERE clause.
If so, compute the attributes or
expressions of the SELECT clause
using the components of this tuple.
9
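The three steps above can be sketched as an explicit loop. This is only a model of the semantics, not how a database system actually executes the query; the class Drink, the function select_names, and the sample rows are hypothetical.

```python
from typing import NamedTuple


class Drink(NamedTuple):
    """One tuple of the relation drinks(name, manf)."""
    name: str
    manf: str


def select_names(relation, manf):
    """Model of: SELECT name FROM relation WHERE manf = <manf>."""
    result = []
    for t in relation:             # tuple variable t visits each tuple
        if t.manf == manf:         # check the WHERE clause on the current tuple
            result.append(t.name)  # compute the SELECT clause from t's components
    return result


# Hypothetical sample data.
drinks = [
    Drink("coke", "Anheuser-Busch"),
    Drink("pepsi", "Other Co."),
    Drink("Michelob", "Anheuser-Busch"),
]
answer = select_names(drinks, "Anheuser-Busch")
print(answer)
```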
* In SELECT clauses
When there is one relation in the FROM
clause, * in the SELECT clause stands for
“all attributes of this relation.”
Example: Using drinks(name, manf):
SELECT *
FROM drinks
WHERE manf = 'Anheuser-Busch';
10
Result of Query:
name       manf
coke       Anheuser-Busch
coke Lite  Anheuser-Busch
Michelob   Anheuser-Busch
...        ...
Now, the result has each of the attributes of drinks.
11
Renaming Attributes
If you want the result to have different
attribute names, use “AS <new
name>” to rename an attribute.
Example: Using drinks(name, manf):
SELECT name AS drink, manf
FROM drinks
WHERE manf = 'Anheuser-Busch';
12
Result of Query:
drink      manf
coke       Anheuser-Busch
coke Lite  Anheuser-Busch
Michelob   Anheuser-Busch
...        ...
13
Expressions in SELECT
Clauses
Any expression that makes sense can
appear as an element of a SELECT
clause.
Example: Using Sells(cafe, drink,
price):
SELECT cafe, drink,
price*114 AS priceInYen
FROM Sells;
14
Result of Query
cafe   drink  priceInYen
Joe’s  coke   285
Sue’s  pepsi  342
…      …      …
15
Example: Constants as
Expressions
Using Likes(drinker, drink):
SELECT drinker,
'likes coke' AS whoLikescoke
FROM Likes
WHERE drink = 'coke';
16
Result of Query
drinker whoLikescoke
Sally likes coke
Fred likes coke
… …
17
Example: Information
Integration
We often build “data warehouses”
from the data at many “sources.”
Suppose each cafe has its own
relation Menu(drink, price) .
To contribute to Sells(cafe, drink,
price) we need to query each cafe
and insert the name of the cafe.
18
Information Integration ---
(2)
For instance, at Joe’s cafe we can
issue the query:
SELECT 'Joe''s cafe', drink, price
FROM Menu;
19
Complex Conditions in
WHERE Clause
Boolean operators AND, OR, NOT.
Comparisons =, <>, <, >, <=, >=.
And many other operators that produce
boolean-valued results.
20
Example: Complex
Condition
Using Sells(cafe, drink, price), find the
price Joe’s cafe charges for coke:
SELECT price
FROM Sells
WHERE cafe = 'Joe''s cafe' AND
drink = 'coke';
21
Patterns
A condition can compare a string
to a pattern by:
<Attribute> LIKE <pattern> or
<Attribute> NOT LIKE <pattern>
Pattern is a quoted string with %
= “any string”; _ = “any
character.”
22
Example: LIKE
Using Drinkers(name, addr, phone)
find the drinkers with exchange 555:
SELECT name
FROM Drinkers
WHERE phone LIKE '%555-____';
23
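A small sketch of the LIKE example, run through Python's sqlite3 module. The phone numbers and addresses are invented; the pattern is '%555-' followed by four underscores with no spaces between them, so it matches any phone ending in 555- plus exactly four characters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Drinkers (name TEXT, addr TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO Drinkers VALUES (?, ?, ?)",
    [
        ("Sally", "1 Main St", "415-555-1234"),  # exchange 555: matches
        ("Fred", "2 Oak Ave", "415-556-1234"),   # exchange 556: no match
        ("Ann", "3 Elm Rd", "555-9876"),         # % may match the empty string
    ],
)

# % matches any string, each _ matches exactly one character.
matches = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM Drinkers WHERE phone LIKE '%555-____'"
    )
]
print(matches)
```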
NULL Values
Tuples in SQL relations can have NULL
as a value for one or more components.
Meaning depends on context. Two
common cases:
Missing value : e.g., we know Joe’s cafe has
some address, but we don’t know what it is.
Inapplicable : e.g., the value of attribute
spouse for an unmarried person.
24
Comparing NULL’s to
Values
The logic of conditions in SQL is really
3-valued logic: TRUE, FALSE,
UNKNOWN.
Comparing any value (including NULL
itself) with NULL yields UNKNOWN.
A tuple is in a query answer iff the
WHERE clause is TRUE (not FALSE or
UNKNOWN).
25
Three-Valued Logic
To understand how AND, OR, and NOT
work in 3-valued logic, think of TRUE
= 1, FALSE = 0, and UNKNOWN = ½.
AND = MIN; OR = MAX, NOT(x) = 1-x.
Example:
TRUE AND (FALSE OR NOT(UNKNOWN))
= MIN(1, MAX(0, (1 - ½ ))) =
MIN(1, MAX(0, ½ )) = MIN(1, ½ ) = ½.
26
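The MIN/MAX encoding above can be checked with a few lines of Python. The function names and3, or3, and not3 are made up for this sketch; exact fractions are used so that ½ is represented without rounding.

```python
from fractions import Fraction

# Truth values encoded as on the slide: TRUE = 1, FALSE = 0, UNKNOWN = 1/2.
TRUE, FALSE, UNKNOWN = Fraction(1), Fraction(0), Fraction(1, 2)


def and3(x, y):
    """Three-valued AND is the minimum of the two values."""
    return min(x, y)


def or3(x, y):
    """Three-valued OR is the maximum of the two values."""
    return max(x, y)


def not3(x):
    """Three-valued NOT is 1 - x."""
    return 1 - x


# TRUE AND (FALSE OR NOT(UNKNOWN)) from the slide.
example = and3(TRUE, or3(FALSE, not3(UNKNOWN)))
print(example == UNKNOWN)
```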
Surprising Example
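From the following Sells relation:
cafe        drink  price
Joe’s cafe  coke   NULL
SELECT cafe
FROM Sells
WHERE price < 2.00 OR price >= 2.00;
Each comparison with the NULL price is UNKNOWN, so the whole
condition is UNKNOWN OR UNKNOWN = UNKNOWN, and Joe’s cafe is
not in the answer.
27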
Reason: 2-Valued Laws != 3-Valued Laws
Some common laws, like
commutativity of AND, hold in 3-
valued logic.
But not others, e.g., the law of the
excluded middle : p OR NOT p =
TRUE.
When p = UNKNOWN, the left side is
MAX( ½, (1 – ½ )) = ½ != 1.
28
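The surprising example can be reproduced with sqlite3. The second query adds an explicit IS NULL test, which is standard SQL but is not part of the slides; it is shown only to illustrate how the "missing" tuple can be recovered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells (cafe TEXT, drink TEXT, price REAL)")
conn.execute("INSERT INTO Sells VALUES ('Joe''s cafe', 'coke', NULL)")

# Both comparisons are UNKNOWN for a NULL price, so the tuple is rejected.
surprising = conn.execute(
    "SELECT cafe FROM Sells WHERE price < 2.00 OR price >= 2.00"
).fetchall()

# Assumption beyond the slides: IS NULL yields TRUE for a NULL price.
with_null_test = conn.execute(
    "SELECT cafe FROM Sells "
    "WHERE price < 2.00 OR price >= 2.00 OR price IS NULL"
).fetchall()

print(surprising, with_null_test)
```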
Multirelation Queries
Interesting queries often combine
data from more than one relation.
We can address several relations
in one query by listing them all in
the FROM clause.
Distinguish attributes of the same
name by “<relation>.<attribute>”.
29
Example: Joining Two Relations
Using relations Likes(drinker, drink)
and Frequents(drinker, cafe), find the
drinks liked by at least one person
who frequents Joe’s cafe.
SELECT drink
FROM Likes, Frequents
WHERE cafe = 'Joe''s cafe' AND
Frequents.drinker = Likes.drinker;
30
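A sketch of the two-relation query above, using sqlite3 and invented rows. Two of the hypothetical drinkers who frequent Joe's cafe like coke, so coke appears twice in the answer: the query pairs tuples and does not eliminate duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Likes (drinker TEXT, drink TEXT);
CREATE TABLE Frequents (drinker TEXT, cafe TEXT);
INSERT INTO Likes VALUES
  ('Sally', 'coke'), ('Fred', 'pepsi'), ('Ann', 'coke');
INSERT INTO Frequents VALUES
  ('Sally', 'Joe''s cafe'), ('Ann', 'Joe''s cafe'), ('Fred', 'Sue''s cafe');
""")

# Both relations are listed in FROM; drinker is disambiguated by relation name.
drinks_at_joes = [
    row[0]
    for row in conn.execute("""
        SELECT drink
        FROM Likes, Frequents
        WHERE cafe = 'Joe''s cafe' AND
              Frequents.drinker = Likes.drinker
    """)
]
print(drinks_at_joes)
```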
Formal Semantics
Almost the same as for single-
relation queries:
1. Start with the product of all the
relations in the FROM clause.
2. Apply the selection condition from
the WHERE clause.
3. Project onto the list of attributes and
expressions in the SELECT clause.
31
Operational Semantics
Imagine one tuple-variable for each
relation in the FROM clause.
These tuple-variables visit each
combination of tuples, one from each
relation.
If the tuple-variables are pointing to
tuples that satisfy the WHERE clause,
send these tuples to the SELECT
clause.
32
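The operational semantics above amounts to nested loops, one per relation in the FROM clause. The following is a model of the meaning only, not of how a real system evaluates the query; the function name and the sample tuples are hypothetical.

```python
def drinks_liked_by_patrons(likes, frequents, cafe):
    """Model of: SELECT drink FROM Likes, Frequents
    WHERE cafe = <cafe> AND Frequents.drinker = Likes.drinker."""
    output = []
    for tv1 in frequents:          # tv1 = (drinker, cafe)
        for tv2 in likes:          # tv2 = (drinker, drink)
            # WHERE clause, checked on the current combination of tuples.
            if tv1[1] == cafe and tv1[0] == tv2[0]:
                output.append(tv2[1])  # SELECT clause: output the drink
    return output


# Hypothetical sample data.
likes = [("Sally", "coke"), ("Fred", "pepsi")]
frequents = [("Sally", "Joe's cafe"), ("Fred", "Sue's cafe")]
result = drinks_liked_by_patrons(likes, frequents, "Joe's cafe")
print(result)
```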
Example
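[Figure: tuple-variable tv1 ranges over Frequents(drinker, cafe) and tv2 over Likes(drinker, drink). With tv1 at (Sally, Joe’s) and tv2 at (Sally, coke): check the cafe for Joe, check that the two drinker values are equal, and send the drink to the output.]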
33
Explicit Tuple-Variables
Sometimes, a query needs to use
two copies of the same relation.
Distinguish copies by following the
relation name by the name of a
tuple-variable, in the FROM clause.
It’s always an option to rename
relations this way, even when not
essential.
34
Example: Self-Join
From drinks(name, manf), find all
pairs of drinks by the same
manufacturer.
Do not produce pairs like (coke, coke).
Produce pairs in alphabetic order, e.g.
(coke, pepsi), not (pepsi, coke).
SELECT b1.name, b2.name
FROM drinks b1, drinks b2
WHERE b1.manf = b2.manf AND
b1.name < b2.name;
35
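The self-join above can be run as follows. The manufacturers 'A' and 'B' and the drink names are placeholder data; b1 and b2 are the two tuple-variables over the same relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE drinks (name TEXT PRIMARY KEY, manf TEXT);
INSERT INTO drinks VALUES
  ('coke', 'A'), ('pepsi', 'A'), ('fanta', 'A'), ('sprite', 'B');
""")

# b1.name < b2.name rules out (x, x) pairs and keeps one order per pair.
pairs = conn.execute("""
    SELECT b1.name, b2.name
    FROM drinks b1, drinks b2
    WHERE b1.manf = b2.manf AND
          b1.name < b2.name
""").fetchall()
print(sorted(pairs))
```

With three drinks by manufacturer 'A' there are three pairs; 'sprite' has no partner and does not appear.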
Subqueries
A parenthesized SELECT-FROM-
WHERE statement (subquery ) can
be used as a value in a number of
places, including FROM and WHERE
clauses.
Example: in place of a relation in the
FROM clause, we can use a
subquery and then query its result.
Must use a tuple-variable to name
tuples of the result.
36
Example: Subquery in
FROM
Find the drinks liked by at least one
person who frequents Joe’s cafe.
SELECT drink
FROM Likes, (SELECT drinker
             FROM Frequents
             WHERE cafe = 'Joe''s cafe') JD
WHERE Likes.drinker = JD.drinker;
The subquery JD returns the drinkers who frequent Joe’s cafe.
37
Subqueries That Return One
Tuple
If a subquery is guaranteed to
produce one tuple, then the
subquery can be used as a value.
Usually, the tuple has one
component.
A run-time error occurs if there is no
tuple or more than one tuple.
38
Example: Single-Tuple
Subquery
Using Sells(cafe, drink, price), find
the cafes that serve pepsi for the
same price Joe charges for coke.
Two queries would surely work:
1. Find the price Joe charges for coke.
2. Find the cafes that serve pepsi at that
price.
39
Query + Subquery
Solution
SELECT cafe
FROM Sells
WHERE drink = 'pepsi' AND
      price = (SELECT price
               FROM Sells
               WHERE cafe = 'Joe''s cafe'
               AND drink = 'coke');
The subquery returns the price at which Joe sells coke.
40
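A sketch of the query-plus-subquery solution with invented prices. One caveat, stated as an assumption about the engine rather than about SQL in general: to my knowledge SQLite does not raise the run-time error mentioned earlier when a scalar subquery returns several rows, so this sketch only exercises the well-behaved one-tuple case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sells (cafe TEXT, drink TEXT, price REAL);
INSERT INTO Sells VALUES
  ('Joe''s cafe', 'coke', 2.50),
  ('Sue''s cafe', 'pepsi', 2.50),
  ('Al''s cafe', 'pepsi', 3.00);
""")

# The parenthesized subquery yields one tuple with one component:
# the price Joe charges for coke.
same_price = [
    row[0]
    for row in conn.execute("""
        SELECT cafe
        FROM Sells
        WHERE drink = 'pepsi' AND
              price = (SELECT price
                       FROM Sells
                       WHERE cafe = 'Joe''s cafe'
                         AND drink = 'coke')
    """)
]
print(same_price)
```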
The IN Operator
<tuple> IN (<subquery>) is true if
and only if the tuple is a member
of the relation produced by the
subquery.
Opposite: <tuple> NOT IN
(<subquery>).
IN-expressions can appear in
WHERE clauses.
41
Example: IN
Using drinks(name, manf) and
Likes(drinker, drink), find the name and
manufacturer of each drink that Fred likes.
SELECT *
FROM drinks
WHERE name IN (SELECT drink
               FROM Likes
               WHERE drinker = 'Fred');
The subquery returns the set of drinks Fred likes.
42
Remember These From Lecture
#1?
SELECT a
FROM R, S
WHERE R.b = S.b;
SELECT a
FROM R
WHERE b IN (SELECT b FROM S);
43
IN is a Predicate About R’s
Tuples
SELECT a
FROM R
WHERE b IN (SELECT b FROM S);
R:  a  b      S:  b  c
    1  2          2  5
    3  4          2  6
S contains two 2’s.
One loop, over the tuples of R: (1,2) satisfies the condition; 1 is output once.
44
This Query Pairs Tuples from R,
S
SELECT a
FROM R, S
WHERE R.b = S.b;
R:  a  b      S:  b  c
    1  2          2  5
    3  4          2  6
Double loop, over the tuples of R and S: (1,2) with (2,5) and (1,2) with (2,6) both satisfy the condition; 1 is output twice.
45
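The difference between the two queries can be observed directly, using the same R and S as on the two slides above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE R (a INTEGER, b INTEGER);
CREATE TABLE S (b INTEGER, c INTEGER);
INSERT INTO R VALUES (1, 2), (3, 4);
INSERT INTO S VALUES (2, 5), (2, 6);
""")

# IN is a predicate on each tuple of R: (1,2) passes once.
in_result = [
    row[0]
    for row in conn.execute("SELECT a FROM R WHERE b IN (SELECT b FROM S)")
]

# The join pairs (1,2) with both (2,5) and (2,6).
join_result = [
    row[0]
    for row in conn.execute("SELECT a FROM R, S WHERE R.b = S.b")
]

print(in_result, join_result)
```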
The Exists Operator
EXISTS(<subquery>) is true if and
only if the subquery result is not
empty.
Example: From drinks(name, manf),
find those drinks that are the
unique drink by their
manufacturer.
46
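One possible way to express "the unique drink by its manufacturer" is with NOT EXISTS: a drink qualifies if no other drink has the same manufacturer. This is a sketch under that reading, not necessarily the query the slides use, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE drinks (name TEXT PRIMARY KEY, manf TEXT);
INSERT INTO drinks VALUES ('coke', 'A'), ('fanta', 'A'), ('sprite', 'B');
""")

# Inside the subquery, unqualified manf and name refer to the inner drinks;
# b1.manf and b1.name refer to the tuple of the outer query.
unique_drinks = [
    row[0]
    for row in conn.execute("""
        SELECT name
        FROM drinks b1
        WHERE NOT EXISTS (SELECT *
                          FROM drinks
                          WHERE manf = b1.manf AND
                                name <> b1.name)
    """)
]
print(unique_drinks)
```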
Example: EXISTS
SELECT name
FROM drinks b1
Notice scope rule: manf refers to the closest nested FROM with
a relation having that attribute.
52
Solution
(SELECT * FROM Likes)
INTERSECT
(SELECT drinker, drink
FROM Sells, Frequents
WHERE Frequents.cafe = Sells.cafe
);
Notice trick: the first subquery is really a stored table.
The second subquery gives the pairs where the drinker frequents
a cafe that sells the drink.
53
Bag Semantics
Although the SELECT-FROM-
WHERE statement uses bag
semantics, the default for union,
intersection, and difference is set
semantics.
That is, duplicates are eliminated as
the operation is applied.
54
Motivation: Efficiency
When doing projection, it is easier
to avoid eliminating duplicates.
Just work tuple-at-a-time.
For intersection or difference, it is
most efficient to sort the relations
first.
At that point you may as well
eliminate the duplicates anyway.
55
Controlling Duplicate
Elimination
Force the result to be a set by
SELECT DISTINCT . . .
Force the result to be a bag (i.e.,
don’t eliminate duplicates) by ALL,
as in . . . UNION ALL . . .
56
Example: DISTINCT
From Sells(cafe, drink, price), find all
the different prices charged for drinks:
SELECT DISTINCT price
FROM Sells;
Notice that without DISTINCT, each
price would be listed as many times
as there were cafe/drink pairs at that
price.
57
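A quick check of the DISTINCT example with invented prices: two cafe/drink pairs share the price 2.5, so it is listed twice without DISTINCT and once with it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sells (cafe TEXT, drink TEXT, price REAL);
INSERT INTO Sells VALUES
  ('Joe''s cafe', 'coke', 2.5),
  ('Joe''s cafe', 'pepsi', 3.0),
  ('Sue''s cafe', 'coke', 2.5);
""")

# Bag result: one price per cafe/drink pair.
all_prices = [row[0] for row in conn.execute("SELECT price FROM Sells")]

# Set result: duplicates eliminated.
distinct_prices = [
    row[0] for row in conn.execute("SELECT DISTINCT price FROM Sells")
]

print(sorted(all_prices), sorted(distinct_prices))
```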
Example: ALL
Using relations Frequents(drinker, cafe)
and Likes(drinker, drink):
(SELECT drinker FROM Frequents)
EXCEPT ALL
(SELECT drinker FROM Likes);
Lists drinkers who frequent more cafes
than they like drinks, and does so as
many times as the difference of those
counts.
58
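The bag difference computed by EXCEPT ALL can be modeled with Python's collections.Counter, which avoids depending on whether a particular engine supports EXCEPT ALL. The function except_all and the sample lists are hypothetical; each list stands for the drinker column of one relation.

```python
from collections import Counter


def except_all(left, right):
    """Bag difference: a value occurring m times in left and n times in
    right occurs max(0, m - n) times in the result."""
    difference = Counter(left) - Counter(right)  # drops counts <= 0
    return sorted(difference.elements())


# Sally frequents 3 cafes and likes 1 drink; Fred frequents 1 and likes 2.
frequents_drinkers = ["Sally", "Sally", "Sally", "Fred"]
likes_drinkers = ["Sally", "Fred", "Fred"]

result = except_all(frequents_drinkers, likes_drinkers)
print(result)
```

Sally is listed twice (3 - 1), and Fred not at all, since he likes more drinks than he frequents cafes.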
Join Expressions
SQL provides several versions of
(bag) joins.
These expressions can be stand-
alone queries or used in place of
relations in a FROM clause.
59
Products and Natural Joins
Natural join:
R NATURAL JOIN S;
Product:
R CROSS JOIN S;
Example:
Likes NATURAL JOIN Sells;
Relations can be parenthesized
subqueries, as well.
60
Theta Join
R JOIN S ON <condition>
Example: using Drinkers(name, addr)
and Frequents(drinker, cafe):
Drinkers JOIN Frequents ON
name = drinker;
gives us all (d, a, d, b) quadruples
such that drinker d lives at address
a and frequents cafe b.
61
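The theta-join example can be tried as a stand-alone query in a FROM clause. The rows are invented; Fred frequents no cafe in this sample, so he contributes no quadruple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Drinkers (name TEXT, addr TEXT);
CREATE TABLE Frequents (drinker TEXT, cafe TEXT);
INSERT INTO Drinkers VALUES ('Sally', '1 Main St'), ('Fred', '2 Oak Ave');
INSERT INTO Frequents VALUES
  ('Sally', 'Joe''s cafe'), ('Sally', 'Sue''s cafe');
""")

# Each result row is a (d, a, d, b) quadruple: name, addr, drinker, cafe.
quadruples = conn.execute(
    "SELECT * FROM Drinkers JOIN Frequents ON name = drinker"
).fetchall()
print(sorted(quadruples))
```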