Week2 Lecture

This document provides an overview of a database course being taught on logical design and the relational algebra. It introduces the professor, course, and chapter being covered. It then reviews queries and relational operators, and how a query is compiled from SQL to relational algebra to physical query plans involving operations like projections, selections, and joins.

Uploaded by

samia lachgar

Course

Introduction to Databases

Professor: Karima Echihabi


Program: Computer Engineering
Session: Winter 2023

1
Databases
Logical Design (The Relational Algebra)
Chapter 4

2
Review

• Motivated the need to study data management


• Explained the importance of the relational model
• Introduced
• Database Management Systems (DBMS)
• The relational model
• Entity-Relationship model

3
Review
Queries go into the DBMS and answers come out. The DBMS layers, from top to bottom:
• Query Optimization and Execution
• Relational Operators
• Files and Access Methods
• Buffer Management
• Disk Space Management
The DB sits below the DBMS.

4
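Today
Queries go into the DBMS and answers come out. The DBMS layers, from top to bottom:
• Query Compilation
• Relational Operators (we are here!)
• Files and Access Methods
• Buffer Management
• Disk Space Management
The DB sits below the DBMS.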

5
Recall our query example
• Given the following instances of Employees and Works_In

Employees
cin    name     lot
B1234  Mohamed  1
C5678  Amina    2
D9012  Imane    3
E3456  Mohamed  4

Works_In
cin    since       did
B1234  01/02/2022  11
C5678  10/08/2020  22
D9012  15/03/2022  33
E3456  05/06/2009  44

• Return all department ids and parking lots for employees with name “Mohamed”

SELECT E.lot, W.did
FROM Employees E, Works_In W
WHERE E.cin = W.cin AND E.name = 'Mohamed'

Result
lot  did
1    11
4    44
6
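The example query can be tried out with Python's built-in sqlite3 module. This is only a sketch: the column types and the choice of SQLite are assumptions, not part of the lecture, and the dates are stored as plain text.

```python
import sqlite3

# In-memory database holding the two example instances from the slide.
# Column types are an assumption; the slide only shows the values.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (cin TEXT PRIMARY KEY, name TEXT, lot INTEGER);
    CREATE TABLE Works_In  (cin TEXT, since TEXT, did INTEGER);
""")
conn.executemany("INSERT INTO Employees VALUES (?, ?, ?)", [
    ("B1234", "Mohamed", 1), ("C5678", "Amina", 2),
    ("D9012", "Imane", 3), ("E3456", "Mohamed", 4),
])
conn.executemany("INSERT INTO Works_In VALUES (?, ?, ?)", [
    ("B1234", "01/02/2022", 11), ("C5678", "10/08/2020", 22),
    ("D9012", "15/03/2022", 33), ("E3456", "05/06/2009", 44),
])

# The query from the slide, with ORDER BY added so the output order is fixed.
rows = conn.execute("""
    SELECT E.lot, W.did
    FROM Employees E, Works_In W
    WHERE E.cin = W.cin AND E.name = 'Mohamed'
    ORDER BY E.lot
""").fetchall()
print(rows)  # [(1, 11), (4, 44)]
```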
Review
SQL Query → (Query Compiler) → Relational Algebra

SELECT E.lot, W.did
FROM Employees E, Works_In W
WHERE E.cin = W.cin AND E.name = 'Mohamed'

$\pi_{E.lot, W.did}(\sigma_{E.name = \text{"Mohamed"}}(Employees \bowtie_{E.cin = W.cin} Works\_In))$

Relational Algebra == Logical Query Plan

Logical Query Plan (produced by parsing), from root to leaves:
• $\pi_{E.lot, W.did}$
• $\sigma_{E.name = \text{"Mohamed"}}$
• $\bowtie_{E.cin = W.cin}$
• Employees, Works_In

Physical Query Plan (produced by optimization), from root to leaves:
• $\pi_{E.lot, W.did}$: On-the-fly Project Iterator
• $\bowtie_{E.cin = W.cin}$: Indexed Nested Loop Join Iterator
• Join inputs: $\sigma_{E.name = \text{"Mohamed"}}$ over Employees, and Works_In
• Leaf access methods: B+tree Index Scan Iterator and Heap Scan Iterator
Review
SQL Query → (Query Compiler) → Relational Algebra

SELECT E.lot, W.did
FROM Employees E, Works_In W
WHERE E.cin = W.cin AND E.name = 'Mohamed'

$\pi_{E.lot, W.did}(\sigma_{E.name = \text{"Mohamed"}}(Employees \bowtie_{E.cin = W.cin} Works\_In))$

Relational Algebra == Logical Query Plan

• Relational Algebra: a procedural query language
• Relational Calculus: a high-level declarative query language
• Relational Algebra and Relational Calculus are equivalent: Codd’s Theorem (1972)
• Relational Calculus is equivalent to First Order Logic
• SQL is one implementation of Relational Calculus

SQL is a declarative language used for interacting with databases. It lets you describe what you want to do with data (retrieving, adding, updating, or deleting) without specifying how to do it. The database engine works out an efficient way to carry out these actions, which keeps SQL concise and readable.

Relational Query Languages

• Query languages: Allow manipulation and retrieval of data from a database.
• Relational model supports simple, powerful QLs:
• Strong formal foundation based on logic.
• Allows for much optimization.
• Query Languages != programming languages!
• QLs not expected to be “Turing complete”.
• QLs not intended to be used for complex calculations.
• QLs support easy, efficient access to large data sets.

14
Formal Relational Query Languages

• Two mathematical Query Languages form the basis for “real” languages (e.g. SQL), and for implementation:
• Relational Algebra: More operational, very useful for representing
execution plans.
• Relational Calculus: Lets users describe what they want, rather than how
to compute it. (Non-operational, declarative.)

15
Why Learn Relational Algebra

• It is the basis for SQL


• It is the query language used by researchers in scientific papers
• It will help you learn why queries behave a certain way and thus
equip you to write better optimized queries
• An RDBMS internally converts SQL to (an extended form of)
relational algebra before performing optimization

16
Relational Algebra

• Pure relational algebra operates on sets


• No duplicate tuples
• In contrast to SQL, which operates on bags (multisets)

17
Preliminaries

• A query is applied to relation instances, and the result of a query is also a relation instance.
• Schemas of input relations for a query are fixed (but query will run
regardless of instance!)
• The schema for the result of a given query is also fixed! Determined by
definition of query language constructs.
• Positional vs. named-field notation:
• Positional notation easier for formal definitions, named-field notation more
readable.
• Both used in SQL

18
Example Instances
• “Sailors” and “Reserves” relations for our examples.
• We’ll use positional or named field notation, and assume that names of fields in query results are ‘inherited’ from names of fields in query input relations.

R1
sid  bid  day
22   101  10/10/96
58   103  11/12/96

S1
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0

S2
sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0
19
Relational Algebra
• Basic operations:
• Selection ($\sigma$): Selects a subset of rows from relation.
• Projection ($\pi$): Deletes unwanted columns from relation.
• Cross-product ($\times$): Allows us to combine two relations.
• Set-difference ($-$): Tuples in relation 1 but not in relation 2.
• Union ($\cup$): Tuples in relation 1 or in relation 2 (or both).
Relational Algebra
• Additional operations that are not essential, but (very!) useful
• Intersection ($\cap$): $A \cap B = A - (A - B)$. Similarly, $A \cup B = (A \cap B) \cup (A - B) \cup (B - A)$.
• Join ($\bowtie$ or $\bowtie_\theta$)
• Division ($/$)
• Renaming ($\rho$)
• Since each operation returns a relation, operations can be composed! (Algebra is “closed”.)
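The two set identities for intersection and union can be checked on small sets. A minimal sketch using Python's built-in set type; the integers simply stand in for tuples and are made up for illustration.

```python
# Two small "relations"; each integer stands in for a tuple.
A = {1, 2, 3, 4}
B = {3, 4, 5}

# Intersection written with set-difference only: A ∩ B = A − (A − B).
intersection = A - (A - B)

# Union as three disjoint pieces: (A ∩ B) ∪ (A − B) ∪ (B − A).
union_parts = (A & B) | (A - B) | (B - A)

print(intersection == A & B)  # True
print(union_parts == A | B)   # True
```

Because every result is again a set, it can be fed straight into the next operation, which is what "closed" means here.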
Relational Algebra vs SQL

22
Projection
• Corresponds to the SELECT clause in SQL
• Deletes attributes that are not in projection list.
• Schema of result contains exactly the fields in the projection list, with the same names that they had in the (only) input relation.

S2
sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0

$\pi_{sname, rating}(S2)$
sname   rating
yuppy   9
lubber  8
guppy   5
rusty   10
24
Projection
• Projection operator has to eliminate duplicates! (Why??)
• Note: real systems typically don’t do duplicate elimination unless the user explicitly asks for it. (Why not?)
• Why SQL does not eliminate duplicates by default: either the duplicates are harmless but eliminating them is expensive, or the duplicates are actually needed, for example to count how many students named Mohamed are in a class.

$\pi_{sname, rating}(S2)$
sname   rating
yuppy   9
lubber  8
guppy   5
rusty   10

$\pi_{age}(S2)$
age
35.0
55.5
25
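The difference between set projection and what SQL does by default can be sketched in a few lines of Python. The helper names `project` and `project_bag` are made up for this illustration; relations are modelled as sets of tuples with a separate schema tuple.

```python
# S2 instance from the slides.
S2_SCHEMA = ("sid", "sname", "rating", "age")
S2 = {
    (28, "yuppy", 9, 35.0),
    (31, "lubber", 8, 55.5),
    (44, "guppy", 5, 35.0),
    (58, "rusty", 10, 35.0),
}

def project(relation, schema, fields):
    """Set semantics: the resulting set drops duplicate tuples."""
    idx = [schema.index(f) for f in fields]
    return {tuple(row[i] for i in idx) for row in relation}

def project_bag(relation, schema, fields):
    """Bag semantics (like SQL without DISTINCT): duplicates are kept."""
    idx = [schema.index(f) for f in fields]
    return [tuple(row[i] for i in idx) for row in relation]

print(sorted(project(S2, S2_SCHEMA, ["age"])))      # [(35.0,), (55.5,)]
print(sorted(project_bag(S2, S2_SCHEMA, ["age"])))  # [(35.0,), (35.0,), (35.0,), (55.5,)]
```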
Selection
• Corresponds to the WHERE clause in SQL
• Selects rows that satisfy selection condition.
• Schema of result identical to schema of (only) input relation.
• No duplicates in result! (Why?)
• Result relation can be the input for another relational algebra operation! (Operator composition.)

S2
sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0

$\sigma_{rating > 8}(S2)$
sid  sname  rating  age
28   yuppy  9       35.0
58   rusty  10      35.0

$\pi_{sname, rating}(\sigma_{rating > 8}(S2))$
sname  rating
yuppy  9
rusty  10
26
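Selection and its composition with projection can be sketched as two small functions over sets of tuples. The helper names `select` and `project` are illustrative, and fields are addressed by position.

```python
# S2 instance from the slides; positions: 0=sid, 1=sname, 2=rating, 3=age.
S2 = {
    (28, "yuppy", 9, 35.0),
    (31, "lubber", 8, 55.5),
    (44, "guppy", 5, 35.0),
    (58, "rusty", 10, 35.0),
}

def select(relation, predicate):
    """sigma: keep the rows for which predicate(row) is true."""
    return {row for row in relation if predicate(row)}

def project(relation, positions):
    """pi: keep only the listed positions; the set removes duplicates."""
    return {tuple(row[i] for i in positions) for row in relation}

# sigma_{rating > 8}(S2), then pi_{sname, rating} applied to that result.
high_rated = select(S2, lambda row: row[2] > 8)
names_and_ratings = project(high_rated, (1, 2))
print(sorted(names_and_ratings))  # [('rusty', 10), ('yuppy', 9)]
```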
Union, Intersection, Set-Difference
• All of these operations take two input relations, which must
be union-compatible:
• Same number of fields.
• `Corresponding’ fields have the same type.

• What is the schema of result?


• Union → UNION (eliminates duplicates) or UNION ALL (keeps duplicates) in SQL
• Set-Difference → EXCEPT in SQL; not commutative
Union, Intersection, Set-Difference

S1
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0

S2
sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0

$S1 \cup S2$
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0
44   guppy   5       35.0
28   yuppy   9       35.0

$S1 - S2$
sid  sname   rating  age
22   dustin  7       45.0

$S1 \cap S2$
sid  sname   rating  age
31   lubber  8       55.5
58   rusty   10      35.0
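The three results can be reproduced with Python sets of tuples. This is a rough sketch: `union_compatible` is a hypothetical helper that only inspects one sample row from each relation, which is enough for this example but is not a full schema check.

```python
S1 = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
S2 = {(28, "yuppy", 9, 35.0), (31, "lubber", 8, 55.5),
      (44, "guppy", 5, 35.0), (58, "rusty", 10, 35.0)}

def union_compatible(r, s):
    """Same number of fields, and corresponding fields have the same type.
    Only one sample row per relation is inspected (illustration only)."""
    r_row, s_row = next(iter(r)), next(iter(s))
    return len(r_row) == len(s_row) and all(
        type(a) is type(b) for a, b in zip(r_row, s_row))

assert union_compatible(S1, S2)

union = S1 | S2          # 5 tuples: the shared tuples appear once
difference = S1 - S2     # tuples in S1 but not in S2
intersection = S1 & S2   # tuples in both

print(len(union))            # 5
print(sorted(difference))    # [(22, 'dustin', 7, 45.0)]
print(sorted(intersection))  # [(31, 'lubber', 8, 55.5), (58, 'rusty', 10, 35.0)]
print(S1 - S2 == S2 - S1)    # False: set-difference is not commutative
```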
Cross-Product

• Each row of S1 is paired with each row of R1.
• Result schema has one field per field of S1 and R1, with field names ‘inherited’ if possible.
• Conflict: Both S1 and R1 have a field called sid.

$S1 \times R1$
(sid)  sname   rating  age   (sid)  bid  day
22     dustin  7       45.0  22     101  10/10/96
22     dustin  7       45.0  58     103  11/12/96
31     lubber  8       55.5  22     101  10/10/96
31     lubber  8       55.5  58     103  11/12/96
58     rusty   10      35.0  22     101  10/10/96
58     rusty   10      35.0  58     103  11/12/96

▪ Renaming operator: $\rho(C(1 \to sid1, 5 \to sid2), S1 \times R1)$
▪ C is the output name, $(1 \to sid1, 5 \to sid2)$ is the renaming list (position → new name), and $S1 \times R1$ is the input.
Renaming
▪ Renaming operator: $\rho(C(1 \to sid1, 5 \to sid2), S1 \times R1)$
▪ C is the output name, $(1 \to sid1, 5 \to sid2)$ is the renaming list (position → new name), and $S1 \times R1$ is the input.

Before renaming, the schema of $S1 \times R1$ has two fields called sid:
(sid)  sname  rating  age  (sid)  bid  day

After renaming (relation C):
sid1  sname   rating  age   sid2  bid  day
22    dustin  7       45.0  22    101  10/10/96
22    dustin  7       45.0  58    103  11/12/96
31    lubber  8       55.5  22    101  10/10/96
31    lubber  8       55.5  58    103  11/12/96
58    rusty   10      35.0  22    101  10/10/96
58    rusty   10      35.0  58    103  11/12/96
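A small sketch of cross-product followed by positional renaming. The function names `cross` and `rename` are made up here, and `rename` only rewrites the schema, since the rows themselves do not change.

```python
from itertools import product

S1_SCHEMA = ("sid", "sname", "rating", "age")
R1_SCHEMA = ("sid", "bid", "day")
S1 = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
R1 = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}

def cross(r, s):
    """Pair every row of r with every row of s by concatenating the tuples."""
    return {a + b for a, b in product(r, s)}

def rename(schema, renaming):
    """renaming maps a 1-based position to a new field name, as in rho(C(1 -> sid1, ...))."""
    return tuple(renaming.get(pos, name) for pos, name in enumerate(schema, start=1))

C_rows = cross(S1, R1)
C_schema = rename(S1_SCHEMA + R1_SCHEMA, {1: "sid1", 5: "sid2"})

print(len(C_rows))  # 6
print(C_schema)     # ('sid1', 'sname', 'rating', 'age', 'sid2', 'bid', 'day')
```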
Joins

• Condition Join: $R \bowtie_c S = \sigma_c(R \times S)$
• Result schema same as that of cross-product
• Note that the output can have duplicate column names so we need to apply the rename operator.
• Fewer tuples than cross-product, might be able to compute more
efficiently
• Also called a theta-join.

35
36
Joins

S1
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0

R1
sid  bid  day
22   101  10/10/96
58   103  11/12/96

$S1 \bowtie_{S1.sid < R1.sid} R1$
sid1  sname   rating  age   sid2  bid  day
22    dustin  7       45.0  58    103  11/12/96
31    lubber  8       55.5  58    103  11/12/96

Need to apply the renaming operator! Both input sid fields appear in the result, shown here as sid1 and sid2.
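The definition of a condition join as a selection over the cross-product can be written almost literally. A minimal sketch with illustrative helper names, using positions instead of field names so the duplicate sid is not a problem:

```python
S1 = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
R1 = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}

def cross(r, s):
    """Cross-product: concatenate every row of r with every row of s."""
    return {a + b for a in r for b in s}

def select(relation, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return {row for row in relation if predicate(row)}

# After concatenation, S1.sid sits at position 0 and R1.sid at position 4,
# so the condition S1.sid < R1.sid becomes row[0] < row[4].
joined = select(cross(S1, R1), lambda row: row[0] < row[4])

for row in sorted(joined):
    print(row)
# (22, 'dustin', 7, 45.0, 58, 103, '11/12/96')
# (31, 'lubber', 8, 55.5, 58, 103, '11/12/96')
```

A real system would avoid materialising the full cross-product; this form only mirrors the algebraic definition.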


Joins
• Equi-Join: A special case of condition join where the condition c contains only equalities.
• Result schema similar to cross-product, with output having duplicate column names, so we need to apply the renaming operator (here the two sid fields become sid1 and sid2).

S1
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0

R1
sid  bid  day
22   101  10/10/96
58   103  11/12/96

$S1 \bowtie_{sid} R1$
41
Joins
• Natural Join: Equijoin on all matching column names; removes the duplicate columns.

S1
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0

R1
sid  bid  day
22   101  10/10/96
58   103  11/12/96

$S1 \bowtie R1$ (matching column: sid)
sid  sname   rating  age   bid  day
22   dustin  7       45.0  101  10/10/96
58   rusty   10      35.0  103  11/12/96
42
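A natural join can be sketched by finding the shared field names, matching on them, and emitting each shared field only once. The function `natural_join` below is a hypothetical nested-loop illustration, not how a DBMS would implement it.

```python
S1_SCHEMA = ("sid", "sname", "rating", "age")
R1_SCHEMA = ("sid", "bid", "day")
S1 = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
R1 = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}

def natural_join(r, r_schema, s, s_schema):
    """Equijoin on all field names shared by both schemas; shared fields appear once."""
    common = [f for f in r_schema if f in s_schema]
    s_rest = [f for f in s_schema if f not in common]
    out_schema = tuple(r_schema) + tuple(s_rest)
    out_rows = set()
    for a in r:
        for b in s:
            if all(a[r_schema.index(f)] == b[s_schema.index(f)] for f in common):
                out_rows.add(a + tuple(b[s_schema.index(f)] for f in s_rest))
    return out_rows, out_schema

rows, schema = natural_join(S1, S1_SCHEMA, R1, R1_SCHEMA)
print(schema)  # ('sid', 'sname', 'rating', 'age', 'bid', 'day')
for row in sorted(rows):
    print(row)
# (22, 'dustin', 7, 45.0, 101, '10/10/96')
# (58, 'rusty', 10, 35.0, 103, '11/12/96')
```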
Division
• Not supported as a primitive operator, but useful for expressing
queries like:
Find sailors who have reserved all boats.
• Let A have 2 fields, x and y; B have only field y:
• $A/B = \{ \langle x \rangle \in \pi_x(A) \mid \forall \langle y \rangle \in B,\ \langle x, y \rangle \in A \}$
• i.e., A/B contains all x tuples (sailors) such that for every y tuple (boat) in B, there is an xy tuple in A.
• Or: If the set of y values (boats) associated with an x value (sailor) in A contains all y values in B, the x value is in A/B.
• In general, x and y can be any lists of fields; y is the list of fields in B, and $x \cup y$ is the list of fields of A.

43
Examples of Division A/B

A
sno  pno
s1   p1
s1   p2
s1   p3
s1   p4
s2   p1
s2   p2
s3   p2
s4   p2
s4   p4

B1 (pno): p2
B2 (pno): p2, p4
B3 (pno): p1, p2, p4

A/B1 (sno): s1, s2, s3, s4
A/B2 (sno): s1, s4
A/B3 (sno): s1


44
Expressing A/B Using Basic Operators

• Division is not essential op; just a useful shorthand.


• (Also true of joins, but joins are so common that systems implement joins
specially.)
• Idea: For A/B, compute all x values that are not `disqualified’ by
some y value in B.
• x value is disqualified if by attaching y value from B, we obtain an xy tuple
that is not in A.

Disqualified x values: $\pi_x((\pi_x(A) \times B) - A)$

A/B: $\pi_x(A) -$ all disqualified tuples, i.e. $A/B = \pi_x(A) - \pi_x((\pi_x(A) \times B) - A)$
45
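The basic-operator formulation of division can be checked against the A, B1, B2, B3 instances from the division examples. A minimal sketch: `divide` follows the disqualification idea step by step, and `divide_direct` (also a made-up name) applies the "for every y in B" definition directly.

```python
A = {("s1", "p1"), ("s1", "p2"), ("s1", "p3"), ("s1", "p4"),
     ("s2", "p1"), ("s2", "p2"), ("s3", "p2"),
     ("s4", "p2"), ("s4", "p4")}
B1 = {("p2",)}
B2 = {("p2",), ("p4",)}
B3 = {("p1",), ("p2",), ("p4",)}

def divide(a, b):
    """A/B using only projection, cross-product and set-difference."""
    xs = {(x,) for x, _ in a}                        # pi_x(A)
    all_pairs = {x + y for x in xs for y in b}       # pi_x(A) x B
    disqualified = {(x,) for x, _ in all_pairs - a}  # pi_x((pi_x(A) x B) - A)
    return xs - disqualified

def divide_direct(a, b):
    """A/B straight from the definition: x qualifies if (x, y) is in A for every y in B."""
    xs = {(x,) for x, _ in a}
    return {x for x in xs if all(x + y in a for y in b)}

print(sorted(divide(A, B1)))  # [('s1',), ('s2',), ('s3',), ('s4',)]
print(sorted(divide(A, B2)))  # [('s1',), ('s4',)]
print(sorted(divide(A, B3)))  # [('s1',)]
```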
Find names of sailors who’ve reserved boat #103

❖ Solution 1: $\pi_{sname}((\sigma_{bid = 103} Reserves) \bowtie Sailors)$

❖ Solution 2:
$\rho(Temp1, \sigma_{bid = 103} Reserves)$
$\rho(Temp2, Temp1 \bowtie Sailors)$
$\pi_{sname}(Temp2)$

❖ Solution 3: $\pi_{sname}(\sigma_{bid = 103}(Reserves \bowtie Sailors))$
49
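The three solutions differ only in the order of operations, which can be checked on the S1 and R1 instances (used here as Sailors and Reserves). A sketch with a hypothetical helper `join_on_sid` that performs the natural join on sid:

```python
# S1 and R1 from the example instances, used as Sailors and Reserves.
Sailors = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
Reserves = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}

def join_on_sid(reserves, sailors):
    """Natural join on sid; output rows are (sid, bid, day, sname, rating, age)."""
    return {r + s[1:] for r in reserves for s in sailors if r[0] == s[0]}

# Solution 1: select on Reserves first, then join, then project sname.
sol1 = {t[3] for t in join_on_sid({r for r in Reserves if r[1] == 103}, Sailors)}

# Solution 2: the same steps with named intermediate results.
temp1 = {r for r in Reserves if r[1] == 103}
temp2 = join_on_sid(temp1, Sailors)
sol2 = {t[3] for t in temp2}

# Solution 3: join first, then select, then project sname.
sol3 = {t[3] for t in join_on_sid(Reserves, Sailors) if t[1] == 103}

print(sol1, sol2, sol3)  # {'rusty'} {'rusty'} {'rusty'}
```

Solution 1 filters Reserves before joining, so the join sees fewer rows; pushing selections down like this is the kind of rewrite a query optimizer looks for.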
Find names of sailors who’ve reserved a red boat

• Information about boat color only available in Boats; so need an extra join:

$\pi_{sname}((\sigma_{color = \text{'red'}} Boats) \bowtie Reserves \bowtie Sailors)$

❖ A more efficient solution:

$\pi_{sname}(\pi_{sid}((\pi_{bid} \sigma_{color = \text{'red'}} Boats) \bowtie Reserves) \bowtie Sailors)$

A query optimizer can find this, given the first solution!


50
Find sailors who’ve reserved a red or a green boat

• Can identify all red or green boats, then find sailors who’ve reserved one of these boats:

$\rho(Tempboats, (\sigma_{color = \text{'red'} \lor color = \text{'green'}} Boats))$
$\pi_{sname}(Tempboats \bowtie Reserves \bowtie Sailors)$

❖ Can also define Tempboats using union! (How?)
❖ What happens if $\lor$ is replaced by $\land$ in this query?
51
Find sailors who’ve reserved a red and a green boat

• Previous approach won’t work! Must identify sailors who’ve reserved red boats, sailors who’ve reserved green boats, then find the intersection (note that sid is a key for Sailors):

$\rho(Tempred, \pi_{sid}((\sigma_{color = \text{'red'}} Boats) \bowtie Reserves))$
$\rho(Tempgreen, \pi_{sid}((\sigma_{color = \text{'green'}} Boats) \bowtie Reserves))$
$\pi_{sname}((Tempred \cap Tempgreen) \bowtie Sailors)$


52
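The contrast between "red or green" and "red and green" can be seen on a tiny instance. The Boats and Reserves rows below are invented purely for illustration (the lecture does not give a Boats instance), and `sids_for_color` is a hypothetical helper.

```python
# Sailors is the S1 instance; Boats and Reserves are made-up sample data.
Sailors = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
Boats = {(101, "Interlake", "red"), (102, "Clipper", "green"), (103, "Marine", "green")}
Reserves = {(22, 101), (22, 102), (31, 101), (58, 103)}  # (sid, bid); day omitted

def sids_for_color(color):
    """pi_sid((sigma_{color = ...} Boats) join Reserves)"""
    bids = {b[0] for b in Boats if b[2] == color}
    return {sid for sid, bid in Reserves if bid in bids}

tempred = sids_for_color("red")
tempgreen = sids_for_color("green")

# Red AND green: intersect the two sid sets, then join with Sailors.
both = {s[1] for s in Sailors if s[0] in tempred & tempgreen}
# Red OR green: union of the two sid sets.
either = {s[1] for s in Sailors if s[0] in tempred | tempgreen}
# Replacing "or" by "and" inside the selection on Boats selects nothing,
# because no single boat has both colors.
and_boats = {b for b in Boats if b[2] == "red" and b[2] == "green"}

print(both)            # {'dustin'}
print(sorted(either))  # ['dustin', 'lubber', 'rusty']
print(and_boats)       # set()
```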
Find the names of sailors who’ve reserved all boats

• Uses division; schemas of the input relations to / must be carefully chosen:

$\rho(Tempsids, (\pi_{sid, bid} Reserves) / (\pi_{bid} Boats))$
$\pi_{sname}(Tempsids \bowtie Sailors)$

❖ To find sailors who’ve reserved all ‘Interlake’ boats:

$..... / \pi_{bid}(\sigma_{bname = \text{'Interlake'}} Boats)$
53
Summary

• The relational model has rigorously defined query languages that are simple and powerful.
• Relational algebra is more operational; useful as internal
representation for query evaluation plans.
• Several ways of expressing a given query; a query optimizer
should choose the most efficient version.

54
