02 04 SQL
3. Multi-table queries
• “Not-Yet-SQL?”
Product
PName Price Manufacturer
The number of attributes is the arity of the relation
H.U. Computer Engineering Department 13
Data Types in SQL
• Atomic types:
• Characters: CHAR(20), VARCHAR(50)
• Numbers: INT, BIGINT, SMALLINT, FLOAT
• Others: MONEY, DATETIME, …
• i.e. if two tuples agree on the values of the key, then they must be
the same tuple!
Students(sid:string, name:string, gpa: float)
In SQL, we may constrain a column to be NOT NULL, e.g., “name” in this table
General Constraints
• We can actually specify arbitrary assertions
• E.g. “There cannot be 25 people in the DB class”
What you will learn about in this section
1. The SFW query
1. LIKE
2. DISTINCT
3. ORDER BY
SELECT <attributes>
FROM <one or more relations>
WHERE <conditions>
SELECT *
FROM Product
WHERE Category = 'Gadgets'
SELECT *
FROM Product
WHERE PName LIKE '%gizmo%'
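The two queries above can be tried out with Python's built-in sqlite3 module. This is a minimal sketch: the rows are the Product sample rows from this lecture, the column types are an assumption, and SQLite's LIKE is case-insensitive for ASCII text by default, which other systems may not share.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the slides only give the attribute names.
conn.execute(
    "CREATE TABLE Product (PName TEXT, Price REAL, Category TEXT, Manufacturer TEXT)"
)
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?, ?)",
    [
        ("Gizmo", 19.99, "Gadgets", "GizmoWorks"),
        ("Powergizmo", 29.99, "Gadgets", "GizmoWorks"),
        ("SingleTouch", 149.99, "Photography", "Canon"),
        ("MultiTouch", 203.99, "Household", "Hitachi"),
    ],
)

# Selection on an exact value.
gadgets = conn.execute(
    "SELECT PName FROM Product WHERE Category = 'Gadgets' ORDER BY PName"
).fetchall()

# Pattern matching: % matches any sequence of characters.
like_gizmo = conn.execute(
    "SELECT PName FROM Product WHERE PName LIKE '%gizmo%' ORDER BY PName"
).fetchall()

print(gadgets)
print(like_gizmo)
```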
What you will learn about in this section
1. Foreign key constraints
2. Joins: basics
Product
PName Price Category Manufacturer
Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi
Answer = {}
for x1 in R1 do
    for x2 in R2 do
        …..
        for xn in Rn do
            if Conditions(x1,…, xn)
                then Answer = Answer ∪ {(x1.a1, x1.a2, …, xn.ak)}
return Answer
Remembering this order is critical to understanding the output of certain queries (see later on…)
Note: we say “semantics” not “execution order”
SELECT Country
FROM Product, Company
WHERE Manufacturer=CName AND Category='Gadgets'
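The nested-loop semantics can be written directly in Python. This is only a sketch of the meaning of a query, not of how a database executes it. The helper name sfw and the Company rows (CName, Country) are hypothetical and chosen for illustration.

```python
from itertools import product

def sfw(relations, condition, projection):
    """Nested-loop semantics: cross product, then condition, then projection."""
    answer = set()
    for combo in product(*relations):      # one nested loop per relation
        if condition(*combo):              # WHERE
            answer.add(projection(*combo)) # SELECT
    return answer

Product = [
    {"PName": "Gizmo", "Category": "Gadgets", "Manufacturer": "GizmoWorks"},
    {"PName": "SingleTouch", "Category": "Photography", "Manufacturer": "Canon"},
]
# Hypothetical Company rows, for illustration only.
Company = [
    {"CName": "GizmoWorks", "Country": "USA"},
    {"CName": "Canon", "Country": "Japan"},
]

countries = sfw(
    [Product, Company],
    lambda p, c: p["Manufacturer"] == c["CName"] and p["Category"] == "Gadgets",
    lambda p, c: (c["Country"],),
)
print(countries)
```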
3. Advanced SQL-izing
1. Set Operators & Nested Queries
What you will learn about in this section
2. Nested queries
An Unintuitive Query
SELECT DISTINCT R.A
FROM R, S, T
WHERE R.A=S.A OR R.A=T.A
But what if S = ∅?
• Semantics:
1. Take cross-product
2. Apply selections / conditions
3. Apply projection
Joins / cross-products are just nested for loops (in simplest implementation)!
What does this look like in Python?
SELECT DISTINCT R.A
FROM R, S, T
WHERE R.A=S.A OR R.A=T.A
R ∩ (S ∪ T)
def query(R, S, T):
    output = set()  # a set, so duplicates are dropped (DISTINCT)
    for r in R:
        for s in S:
            for t in T:
                if r['A'] == s['A'] or r['A'] == t['A']:
                    output.add(r['A'])
    return list(output)
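Running the nested loops on small inputs shows the surprise. In this sketch the sample values of R, S and T are made up. With an empty S the loop body never runs, so the result is empty even though R and T share a value.

```python
def query(R, S, T):
    # Same nested-loop semantics as the SQL query, written as a set comprehension.
    return {
        r["A"]
        for r in R
        for s in S
        for t in T
        if r["A"] == s["A"] or r["A"] == t["A"]
    }

# Made-up sample relations with a single attribute A.
R = [{"A": 1}, {"A": 2}]
T = [{"A": 2}, {"A": 3}]

with_s = query(R, [{"A": 1}], T)  # S is non-empty
empty_s = query(R, [], T)         # S is empty: the cross product is empty

# What one might expect instead: R ∩ (S ∪ T) with S = ∅.
expected = {r["A"] for r in R} & {t["A"] for t in T}

print(with_s, empty_s, expected)
```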
Recall Multisets
Multiset X, listed tuple by tuple:
(1, a), (1, a), (1, b), (2, c), (2, c), (2, c), (1, d), (1, d)
Equivalent representation of a multiset, with λ(X) = “Count of tuple in X”:
Tuple    λ(X)
(1, a)   2
(1, b)   1
(2, c)   3
(1, d)   2
(Items not listed have implicit count 0)
Generalizing Set Operations to Multiset Operations
Multiset X ∩ Multiset Y = Multiset Z
Tuple    λ(X)   λ(Y)   λ(Z)
(1, a)   2      5      2
(1, b)   0      1      0
(2, c)   3      2      2
(1, d)   0      2      0
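In the table above, each count in Z is the smaller of the counts in X and Y, that is λ(Z) = min(λ(X), λ(Y)). Python's collections.Counter behaves like a multiset, and its & operator takes exactly this minimum, so the table can be checked with a short sketch:

```python
from collections import Counter

# Counts taken from the table; tuples with count 0 are simply left out.
X = Counter({(1, "a"): 2, (2, "c"): 3})
Y = Counter({(1, "a"): 5, (1, "b"): 1, (2, "c"): 2, (1, "d"): 2})

# Multiset intersection: the minimum of the two counts for every tuple.
Z = X & Y
print(Z)
```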
Generalizing Set Operations to Multiset Operations
Multiset X ∪ Multiset Y = Multiset Z
Tuple    λ(X)   λ(Y)   λ(Z)
(1, a)   2      5      7
(1, b)   0      1      1
(2, c)   3      2      5
(1, d)   0      2      2
λ(Z) = λ(X) + λ(Y)
For sets, this is union
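The additive rule λ(Z) = λ(X) + λ(Y) can be checked the same way with collections.Counter, whose + operator adds counts. A small sketch using the counts from the table:

```python
from collections import Counter

X = Counter({(1, "a"): 2, (2, "c"): 3})
Y = Counter({(1, "a"): 5, (1, "b"): 1, (2, "c"): 2, (1, "d"): 2})

# Multiset union as used here: counts are added, so no duplicate is lost.
Z = X + Y
print(Z)
```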
Multiset Operations in SQL
Explicit Set Operators: INTERSECT
SELECT R.A
FROM R, S
WHERE R.A=S.A
INTERSECT
SELECT R.A
FROM R, T
WHERE R.A=T.A
Q1 ∩ Q2 = {r.A | r.A = s.A} ∩ {r.A | r.A = t.A}
UNION
SELECT R.A
FROM R, S
WHERE R.A=S.A
UNION
SELECT R.A
FROM R, T
WHERE R.A=T.A
Q1 ∪ Q2 = {r.A | r.A = s.A} ∪ {r.A | r.A = t.A}
Why aren’t there duplicates?
What if we want duplicates?
UNION ALL
SELECT R.A
FROM R, S
WHERE R.A=S.A
UNION ALL
SELECT R.A
FROM R, T
WHERE R.A=T.A
{r.A | r.A = s.A} ∪ {r.A | r.A = t.A} (ALL keeps duplicates: multiset union)
EXCEPT
SELECT R.A
FROM R, S
WHERE R.A=S.A
EXCEPT
SELECT R.A
FROM R, T
WHERE R.A=T.A
{r.A | r.A = s.A} \ {r.A | r.A = t.A}
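The four operators can be compared side by side on one small database. This sketch uses Python's sqlite3 module with made-up contents for R, S and T. Q1 and Q2 are the two sub-queries used on the slides above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up single-column relations, for illustration only.
for name, values in {"R": [1, 2, 3], "S": [1, 2], "T": [2, 3]}.items():
    conn.execute(f"CREATE TABLE {name} (A INTEGER)")
    conn.executemany(f"INSERT INTO {name} VALUES (?)", [(v,) for v in values])

Q1 = "SELECT R.A FROM R, S WHERE R.A = S.A"  # yields 1, 2
Q2 = "SELECT R.A FROM R, T WHERE R.A = T.A"  # yields 2, 3

def run(op):
    """Combine Q1 and Q2 with the given set operator; sort for stable output."""
    rows = conn.execute(f"{Q1} {op} {Q2}").fetchall()
    return sorted(a for (a,) in rows)

results = {op: run(op) for op in ("INTERSECT", "UNION", "UNION ALL", "EXCEPT")}
print(results)
```

Only UNION ALL returns the value 2 twice. The other three operators use set semantics and remove duplicates.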
INTERSECT: Still some subtle problems…
Company(name, hq_city)
Product(pname, maker, factory_loc)
“Headquarters of companies which make products in US AND China”
SELECT hq_city
FROM Company, Product
WHERE maker = name
AND factory_loc = 'US'
INTERSECT
SELECT hq_city
FROM Company, Product
WHERE maker = name
AND factory_loc = 'China'
INTERSECT: Remember the semantics!
Company(name, hq_city) AS C
Product(pname, maker, factory_loc) AS P
SELECT hq_city
FROM Company, Product
WHERE maker = name
AND factory_loc='US'
INTERSECT
SELECT hq_city
FROM Company, Product
WHERE maker = name
AND factory_loc='China'
Example: C JOIN P on maker = name
C.name   C.hq_city   P.pname   P.maker   P.factory_loc
X Co.    Seattle     X         X Co.     U.S.
Y Inc.   Seattle     X         Y Inc.    China
X Co. has a factory in the US (but not China)
Y Inc. has a factory in China (but not US)
But Seattle is returned by the query!
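The example can be reproduced with Python's sqlite3 module. This sketch loads the two companies and two products from the table, spelling the location 'US' so that it matches the query. The second query, which intersects on the company name instead of the city, is one possible repair shown as an assumption. It is not necessarily the fix the lecture has in mind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Company (name TEXT, hq_city TEXT);
CREATE TABLE Product (pname TEXT, maker TEXT, factory_loc TEXT);
INSERT INTO Company VALUES ('X Co.', 'Seattle'), ('Y Inc.', 'Seattle');
INSERT INTO Product VALUES ('X', 'X Co.', 'US'), ('X', 'Y Inc.', 'China');
""")

# The query from the slide: intersects cities, not companies.
cities = conn.execute("""
SELECT hq_city FROM Company, Product
WHERE maker = name AND factory_loc = 'US'
INTERSECT
SELECT hq_city FROM Company, Product
WHERE maker = name AND factory_loc = 'China'
""").fetchall()

# Possible repair (an assumption): intersect on the company itself.
companies = conn.execute("""
SELECT name FROM Company, Product
WHERE maker = name AND factory_loc = 'US'
INTERSECT
SELECT name FROM Company, Product
WHERE maker = name AND factory_loc = 'China'
""").fetchall()

print(cities, companies)
```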
Nested Queries
Is this query equivalent?
SELECT c.city
FROM Company c,
Product pr,
Purchase p
WHERE c.name = pr.maker
AND pr.name = p.product
AND p.buyer = ‘Joe Blow’
Beware of duplicates!
Subqueries Returning Relations
You can also use operations of the form: ANY and ALL not supported by
• s > ALL R SQLite.
• s < ANY R
• EXISTS R
Correlated Queries
Movie(title, year, director, length)
Find movies whose title appears more than once.
SELECT DISTINCT title
FROM Movie AS m
WHERE year <> ANY(
SELECT year
FROM Movie
WHERE title = m.title)
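Since SQLite does not support ANY, the query cannot be run there as written. The sketch below rewrites it with EXISTS, which should return the same titles: a movie qualifies if some other row has the same title and a different year. The Movie rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Movie (title TEXT, year INTEGER, director TEXT, length INTEGER)"
)
# Made-up rows: 'Movie A' appears twice with different years.
conn.executemany(
    "INSERT INTO Movie VALUES (?, ?, ?, ?)",
    [
        ("Movie A", 1950, "Director 1", 100),
        ("Movie A", 2005, "Director 2", 120),
        ("Movie B", 1980, "Director 3", 90),
    ],
)

# Correlated subquery: m.title and m.year refer to the row of the outer query.
repeated = conn.execute("""
SELECT DISTINCT title
FROM Movie AS m
WHERE EXISTS (
    SELECT 1
    FROM Movie
    WHERE title = m.title AND year <> m.year)
""").fetchall()

print(repeated)
```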
Complex Correlated Query
Product(name, price, category, maker, year)
2. Aggregation & GROUP BY
What you will learn about in this section
1. Aggregation operators
2. GROUP BY
Aggregation
SELECT AVG(price)
FROM Product
WHERE maker = 'Toyota'

SELECT COUNT(*)
FROM Product
WHERE year > 1995
Aggregation: COUNT
We probably want:
SELECT COUNT(DISTINCT category)
FROM Product
WHERE year > 1995
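A quick way to see why DISTINCT matters is to run the count with and without it. In this sketch the table follows the Product(name, price, category, maker, year) schema from the lecture, and the rows and year values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Product (name TEXT, price REAL, category TEXT, maker TEXT, year INTEGER)"
)
# Made-up rows: three products after 1995, in two categories.
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?, ?, ?)",
    [
        ("Gizmo", 19.99, "Gadgets", "GizmoWorks", 2001),
        ("Powergizmo", 29.99, "Gadgets", "GizmoWorks", 2003),
        ("SingleTouch", 149.99, "Photography", "Canon", 1998),
        ("OldCam", 99.0, "Photography", "Canon", 1990),
    ],
)

def scalar(sql):
    """Run a query that returns a single value."""
    return conn.execute(sql).fetchone()[0]

# COUNT(category) counts one value per row, duplicates included.
count_with_dups = scalar("SELECT COUNT(category) FROM Product WHERE year > 1995")
# COUNT(DISTINCT category) counts each category once.
count_distinct = scalar("SELECT COUNT(DISTINCT category) FROM Product WHERE year > 1995")

print(count_with_dups, count_distinct)
```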
More Examples
Purchase(product, date, price, quantity)
Simple Aggregations
Purchase
Product Date Price Quantity
bagel 10/21 1 20
banana 10/3 0.5 10
banana 10/10 1 10
bagel 10/25 1.50 20
1. Compute the FROM and WHERE clauses
SELECT product, SUM(price*quantity) AS TotalSales
FROM Purchase
WHERE date > ‘10/1/2005’
GROUP BY product
2. Group by the attributes in the GROUP BY
SELECT product, SUM(price*quantity) AS TotalSales
FROM Purchase
WHERE date > ‘10/1/2005’
GROUP BY product
3. Compute the SELECT clause: grouped
attributes and aggregates
SELECT product, SUM(price*quantity) AS TotalSales
FROM Purchase
WHERE date > ‘10/1/2005’
GROUP BY product
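The grouped query can be run on the Purchase rows shown earlier. This sketch makes one assumption that differs from the slides: dates are stored as ISO strings (for example '2005-10-21') so that plain string comparison orders them correctly, and the year 2005 is assumed for all rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Purchase (product TEXT, date TEXT, price REAL, quantity INTEGER)"
)
# Rows from the Purchase table; ISO dates and the year 2005 are assumptions.
conn.executemany(
    "INSERT INTO Purchase VALUES (?, ?, ?, ?)",
    [
        ("bagel", "2005-10-21", 1.0, 20),
        ("banana", "2005-10-03", 0.5, 10),
        ("banana", "2005-10-10", 1.0, 10),
        ("bagel", "2005-10-25", 1.5, 20),
    ],
)

totals = conn.execute("""
SELECT product, SUM(price * quantity) AS TotalSales
FROM Purchase
WHERE date > '2005-10-01'
GROUP BY product
ORDER BY product
""").fetchall()

print(totals)
```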
GROUP BY vs. Nested Queries
SELECT product, Sum(price*quantity) AS TotalSales
FROM Purchase
WHERE date > ‘10/1/2005’
GROUP BY product
General form of Grouping and Aggregation
SELECT S
FROM R1,…,Rn
WHERE C1
GROUP BY a1,…,ak
HAVING C2
• S = can ONLY contain attributes a1,…,ak and/or aggregates over other attributes
• C1 = any condition on the attributes in R1,…,Rn
• C2 = any condition on the aggregate expressions
Evaluation steps:
1. Evaluate FROM-WHERE: apply condition C1 on the
attributes in R1,…,Rn
2. GROUP BY the attributes a1,…,ak
3. Apply condition C2 to each group (may have aggregates)
4. Compute aggregates in S and return the result
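The four evaluation steps can be traced by hand in plain Python. This is a sketch of the semantics only, using the Purchase rows with assumed ISO dates. The HAVING condition (total quantity above 30) is hypothetical and chosen so that one group is filtered out.

```python
from collections import defaultdict

# (product, date, price, quantity); ISO dates are an assumption.
purchases = [
    ("bagel", "2005-10-21", 1.0, 20),
    ("banana", "2005-10-03", 0.5, 10),
    ("banana", "2005-10-10", 1.0, 10),
    ("bagel", "2005-10-25", 1.5, 20),
]

# 1. FROM-WHERE: keep the rows that satisfy C1.
rows = [p for p in purchases if p[1] > "2005-10-01"]

# 2. GROUP BY product.
groups = defaultdict(list)
for product, date, price, quantity in rows:
    groups[product].append((price, quantity))

# 3. HAVING: keep groups that satisfy C2 (hypothetical: SUM(quantity) > 30).
kept = {k: g for k, g in groups.items() if sum(q for _, q in g) > 30}

# 4. SELECT: the grouped attribute plus the aggregate for each remaining group.
result = [(k, sum(p * q for p, q in g)) for k, g in sorted(kept.items())]

print(result)
```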
Group-by vs. Nested Query
Author(login, name)
Wrote(login, url)
Group-by vs. Nested Query
Which way is more efficient?
• Attempt #2- With group-by: How about when written this way?
What you will learn about in this section
1. Quantifiers
2. NULLs
3. Outer Joins
Quantifiers
Product(name, price, company)
Company(name, city)
An existential quantifier is a
logical quantifier (roughly)
of the form “there exists”
Quantifiers
Product(name, price, company)
Company(name, city)
Find all companies with products all having price < 100
Equivalent
Find all companies that make only products with price < 100
SELECT DISTINCT Company.name
FROM Company
WHERE Company.name NOT IN(
SELECT Product.company
FROM Product
WHERE Product.price >= 100)
A universal quantifier is of the form “for all”
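The "for all" query can be tested with Python's sqlite3 module. In this sketch the city values and the company 'NoProducts Inc.' are hypothetical. The latter is included to show that a company with no products at all also satisfies "all products have price < 100".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Company (name TEXT, city TEXT);
CREATE TABLE Product (name TEXT, price REAL, company TEXT);
-- City values and 'NoProducts Inc.' are hypothetical.
INSERT INTO Company VALUES
    ('GizmoWorks', 'City A'), ('Canon', 'City B'), ('NoProducts Inc.', 'City C');
INSERT INTO Product VALUES
    ('Gizmo', 19.99, 'GizmoWorks'),
    ('Powergizmo', 29.99, 'GizmoWorks'),
    ('SingleTouch', 149.99, 'Canon');
""")

# "All products cost < 100" is expressed as "no product costs >= 100".
cheap_only = conn.execute("""
SELECT DISTINCT Company.name
FROM Company
WHERE Company.name NOT IN (
    SELECT Product.company
    FROM Product
    WHERE Product.price >= 100)
ORDER BY Company.name
""").fetchall()

print(cheap_only)
```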
NULLS in SQL
• Whenever we don’t have a value, we can put a NULL
• The schema specifies for each attribute whether it can be null (nullable attribute) or not
FALSE = 0
TRUE = 1
UNKNOWN = 0.5
Null Values
• C1 AND C2 = min(C1, C2)
• C1 OR C2 = max(C1, C2)
• NOT C1 = 1 – C1
SELECT *
FROM Person
WHERE age < 25 OR age >= 25
Null Values
Can test for NULL explicitly:
• x IS NULL
• x IS NOT NULL
SELECT *
FROM Person
WHERE age < 25 OR age >= 25
OR age IS NULL
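Both versions of the query can be compared on a table that contains one NULL age. This sketch uses Python's sqlite3 module. The name column and the sample people are hypothetical. The last lines replay the min/max rules with UNKNOWN encoded as 0.5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (name TEXT, age INTEGER)")
# Hypothetical rows; None is stored as NULL.
conn.executemany(
    "INSERT INTO Person VALUES (?, ?)",
    [("Ann", 20), ("Bob", 30), ("Cy", None)],
)

without_null = conn.execute(
    "SELECT name FROM Person WHERE age < 25 OR age >= 25 ORDER BY name"
).fetchall()

with_null = conn.execute(
    "SELECT name FROM Person WHERE age < 25 OR age >= 25 OR age IS NULL ORDER BY name"
).fetchall()

# Three-valued logic for the NULL row: both comparisons are UNKNOWN.
TRUE, UNKNOWN, FALSE = 1.0, 0.5, 0.0
where_value = max(UNKNOWN, UNKNOWN)  # C1 OR C2 = max(C1, C2)
row_is_kept = where_value == TRUE    # only TRUE rows pass the WHERE clause

print(without_null, with_null, row_is_kept)
```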
RECAP: Inner Joins
By default, joins in SQL are “inner joins”:
Product(name, category)
Purchase(prodName, store)
INNER JOIN:
Product(name, category)
Purchase(prodName, store)
SELECT Product.name, Purchase.store
FROM Product
INNER JOIN Purchase
ON Product.name = Purchase.prodName
Result:
name     store
Gizmo    Wiz
Camera   Ritz
Camera   Wiz
Products that never sold (with no Purchase tuple) will be lost!
Outer Joins
• An outer join returns tuples from the joined relations that don’t have a corresponding tuple in the other relations
• I.e. If we join relations A and B on a.X = b.X, and there is an entry in A with X=5, but none in B with X=5…
• A LEFT OUTER JOIN will return a tuple (a, NULL)!
SELECT Product.name, Purchase.store
FROM Product
LEFT OUTER JOIN Purchase
ON Product.name = Purchase.prodName
Result:
name       store
Gizmo      Wiz
Camera     Ritz
Camera     Wiz
OneClick   NULL
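The difference between the two joins shows up when both run on the same data. In this sketch the Purchase rows are taken from the result tables, while the category values in Product are assumptions because the slides do not list them. NULL appears as None in Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (name TEXT, category TEXT);
CREATE TABLE Purchase (prodName TEXT, store TEXT);
-- Category values are assumed; OneClick has no purchase.
INSERT INTO Product VALUES
    ('Gizmo', 'Gadget'), ('Camera', 'Photo'), ('OneClick', 'Photo');
INSERT INTO Purchase VALUES
    ('Gizmo', 'Wiz'), ('Camera', 'Ritz'), ('Camera', 'Wiz');
""")

def join(kind):
    """Run the same query with the given join type."""
    return conn.execute(f"""
        SELECT Product.name, Purchase.store
        FROM Product
        {kind} JOIN Purchase
        ON Product.name = Purchase.prodName
        ORDER BY Product.name, Purchase.store
    """).fetchall()

inner = join("INNER")
left = join("LEFT OUTER")

print(inner)
print(left)
```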
Other Outer Joins
Summary
Acknowledgements
The course material used for this lecture is mostly taken and/or adapted from the course materials of the CS145 Introduction to Databases lecture given by Christopher Ré at Stanford University (http://web.stanford.edu/class/cs145/).