Lec5 Relational Algebra
Database System I
Lecture 5. Relational Algebra
1
What have we learned
• Lec 1. Database History
• Lec 2. Relational Model
• Lec 3-4. SQL
2
Why Does Relational Algebra Matter?
• An essential topic for understanding how query
processing and optimization work
• What happens when an SQL query is issued to a database?
3
Relational Query Languages
• Query languages allow the manipulation and retrieval
of data from a database
• Recent Years:
• Everything interesting involves a large data set
• QLs are quite powerful for expressing algorithms at scale
4
Formal Query Languages
• Relational Algebra
• Procedural query language
• used to represent execution plans
• Relational Calculus
• Non-procedural (declarative) query language
• Describe what you want, rather than how to compute it
• Foundation for SQL
5
Results of a Query
• Query is a function over relations
Q(R1, ..., Rn) = R_result
6
Sets vs. Bags
• Sets: {a, b, c}, {a, d, e, f}, {}, …
• Bags: {a, a, b, c}, {b, b, b, b, b}, …
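As a small illustration (not part of the original slides), Python's built-in `set` and `collections.Counter` can stand in for a set and a bag. Using `Counter` as the bag type is an assumption made only for this sketch.

```python
from collections import Counter

items = ["a", "a", "b", "c"]

# A set keeps each element at most once.
as_set = set(items)

# A bag (multiset) also records how many times each element occurs.
as_bag = Counter(items)

print(sorted(as_set))  # ['a', 'b', 'c']
print(as_bag["a"])     # 2
```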
7
Relational Algebra Operators
• Core 5 operators
• Selection (σ)
• Projection (π)
• Union (∪)
• Set Difference (−)
• Cross product (×)
• Additional operators
• Rename (ρ)
• Join (⋈)
• Intersect (∩)
9
Selection
• The selection operator, σ (sigma), specifies the
rows to be retained from the input relation
• A selection has the form: σ_{condition}(relation), where
condition is a Boolean expression
• Terms in the condition are comparisons between two
fields (or a field and a constant)
• Using one of the comparison operators: <, ≤, =, ≠, ≥, >
• Terms may be connected by ∧ (and) or ∨ (or)
• Terms may be negated using ¬ (not)
10
Selection Example
σ_{birth < 1981}(Customer)
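The selection above can be sketched in Python, treating a relation as a set of rows and the condition as a Boolean function on one row. The `Customer` tuple type and the `select` helper are names made up for this illustration; the rows are the Customer relation used in this lecture.

```python
from typing import Callable, NamedTuple, Set

class Customer(NamedTuple):
    sin: int
    firstName: str
    lastName: str
    birth: int

# Rows of the Customer relation used in the lecture examples.
CUSTOMER = {
    Customer(111, "Buffy", "Summers", 1981),
    Customer(222, "Xander", "Harris", 1981),
    Customer(333, "Cordelia", "Chase", 1980),
    Customer(444, "Rupert", "Giles", 1955),
    Customer(555, "Dawn", "Summers", 1984),
}

def select(relation: Set[Customer],
           condition: Callable[[Customer], bool]) -> Set[Customer]:
    """Selection: keep only the rows for which the condition holds."""
    return {row for row in relation if condition(row)}

# Selection with the condition birth < 1981
born_before_1981 = select(CUSTOMER, lambda row: row.birth < 1981)
print(sorted(row.firstName for row in born_before_1981))  # ['Cordelia', 'Rupert']
```

The result has the same schema as the input; only the set of rows shrinks.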
11
Projection
• The projection operator, π (pi), specifies the
columns to be retained from the input relation
• A projection has the form: π_{columns}(relation)
• Where columns is a comma-separated list of column
names
• The list contains the names of the columns to be
retained in the result relation
12
Projection Example
Customer
sin  firstName  lastName  birth
111  Buffy      Summers   1981
222  Xander     Harris    1981
333  Cordelia   Chase     1980
444  Rupert     Giles     1955
555  Dawn       Summers   1984

π_{firstName, lastName}(Customer)
firstName  lastName
Buffy      Summers
Xander     Harris
Cordelia   Chase
Rupert     Giles
Dawn       Summers

π_{birth}(Customer)
birth
1981
1980
1955
1984
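A minimal Python sketch of projection, assuming rows are stored as dictionaries keyed by column name (the `project` helper is a hypothetical name for this illustration). Returning a set of tuples reproduces the set semantics: the two rows with birth 1981 collapse into one.

```python
CUSTOMER = [
    {"sin": 111, "firstName": "Buffy", "lastName": "Summers", "birth": 1981},
    {"sin": 222, "firstName": "Xander", "lastName": "Harris", "birth": 1981},
    {"sin": 333, "firstName": "Cordelia", "lastName": "Chase", "birth": 1980},
    {"sin": 444, "firstName": "Rupert", "lastName": "Giles", "birth": 1955},
    {"sin": 555, "firstName": "Dawn", "lastName": "Summers", "birth": 1984},
]

def project(relation, columns):
    """Projection: keep only the listed columns; the set drops duplicate rows."""
    return {tuple(row[c] for c in columns) for row in relation}

names = project(CUSTOMER, ["firstName", "lastName"])
births = project(CUSTOMER, ["birth"])

print(len(names))      # 5
print(sorted(births))  # [(1955,), (1980,), (1981,), (1984,)]
```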
13
Selection and Projection Notes
• Results of selection and projection contain no duplicates
• Since relations are sets, duplicate rows produced by a projection are removed
• Both operations require one input relation
• The schema of the result of a selection is the same as
the schema of the input relation
• The schema of the result of a projection contains just
those attributes in the projection list
14
Composing Selection and Projection
π_{sin, firstName}(σ_{birth < 1982 ∧ lastName = "Summers"}(Customer))
sin  firstName
111  Buffy
15
Composing Selection and Projection
Customer
sin  firstName  lastName  birth
111  Buffy      Summers   1981
222  Xander     Harris    1981
333  Cordelia   Chase     1980
444  Rupert     Giles     1955
555  Dawn       Summers   1984

π_{birth}(σ_{birth < 1981}(Customer))
birth
1980
1955

σ_{birth < 1981}(π_{birth}(Customer))
birth
1980
1955
16
Commutative property
• For example:
• x+y=y+x
• x*y=y*x
• What about
π_{firstName}(σ_{birth < 1981}(Customer))?
17
Commutative property
Customer
sin  firstName  lastName  birth
111  Buffy      Summers   1981
222  Xander     Harris    1981
333  Cordelia   Chase     1980
444  Rupert     Giles     1955
555  Dawn       Summers   1984

π_{firstName}(σ_{birth < 1981}(Customer))
firstName
Cordelia
Rupert

σ_{birth < 1981}(π_{firstName}(Customer))
Not valid: birth is no longer available after the projection

π_{firstName}(σ_{birth < 1981}(π_{firstName, birth}(Customer)))
firstName
Cordelia
Rupert
18
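The point of this slide can be checked with a short Python sketch: selection and projection can only be swapped if the projection keeps every attribute the selection condition needs. The helpers `select` and `project` and the dictionary row layout are assumptions made for this illustration.

```python
CUSTOMER = [
    {"sin": 111, "firstName": "Buffy", "lastName": "Summers", "birth": 1981},
    {"sin": 222, "firstName": "Xander", "lastName": "Harris", "birth": 1981},
    {"sin": 333, "firstName": "Cordelia", "lastName": "Chase", "birth": 1980},
    {"sin": 444, "firstName": "Rupert", "lastName": "Giles", "birth": 1955},
    {"sin": 555, "firstName": "Dawn", "lastName": "Summers", "birth": 1984},
]

def select(relation, condition):
    """Selection: keep the rows for which the condition holds."""
    return [row for row in relation if condition(row)]

def project(relation, columns):
    """Projection with set semantics: duplicate rows are dropped."""
    out = []
    for row in relation:
        reduced = {c: row[c] for c in columns}
        if reduced not in out:
            out.append(reduced)
    return out

def before_1981(row):
    return row["birth"] < 1981

# Select first, then project: valid.
plan_a = project(select(CUSTOMER, before_1981), ["firstName"])

# Project to firstName first, then select on birth: the attribute is gone.
try:
    select(project(CUSTOMER, ["firstName"]), before_1981)
    swapped_ok = True
except KeyError:
    swapped_ok = False

# Keep birth in the inner projection, select, then project it away.
plan_c = project(
    select(project(CUSTOMER, ["firstName", "birth"]), before_1981),
    ["firstName"],
)

print(plan_a == plan_c, swapped_ok)  # True False
```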
Set Operations Review
A = {1, 3, 6} B = {1, 2, 5, 6}
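For reference, the results of the standard set operations on these two sets can be worked out with Python's built-in set type:

```python
A = {1, 3, 6}
B = {1, 2, 5, 6}

union = A | B          # elements in A or B
intersection = A & B   # elements in both A and B
a_minus_b = A - B      # elements in A but not in B
b_minus_a = B - A      # elements in B but not in A

print(sorted(union))         # [1, 2, 3, 5, 6]
print(sorted(intersection))  # [1, 6]
print(sorted(a_minus_b))     # [3]
print(sorted(b_minus_a))     # [2, 5]
```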
19
Union Compatible Relations
A op B = R_result
• where op = ∪, ∩, or −
• A and B must be union compatible
• Same number of fields
• Field i in each schema has the same type
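The two conditions can be written as a small check. This is only a sketch: schemas are modelled as lists of (name, type) pairs, `union_compatible` is a hypothetical helper, and the types given for sin, firstName and lastName are assumptions (the slides only state that birth is a DATE and salary is a REAL).

```python
def union_compatible(schema_a, schema_b):
    """True if both schemas have the same number of fields and
    the i-th fields have the same type (field names may differ)."""
    if len(schema_a) != len(schema_b):
        return False
    return all(type_a == type_b
               for (_, type_a), (_, type_b) in zip(schema_a, schema_b))

# Types of the first three fields are assumed for illustration.
customer = [("sin", "INTEGER"), ("firstName", "CHAR"),
            ("lastName", "CHAR"), ("birth", "DATE")]
employee = [("sin", "INTEGER"), ("firstName", "CHAR"),
            ("lastName", "CHAR"), ("salary", "REAL")]

print(union_compatible(customer, employee))          # False
print(union_compatible(customer[:3], employee[:3]))  # True
```

Projecting both relations onto their first three fields is one example of a preliminary operation that makes them union compatible.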
20
Union Compatible Relations
Intersection of the Employee and Customer relations

Customer
sin  firstName  lastName  birth
111  Buffy      Summers   1981
222  Xander     Harris    1981
333  Cordelia   Chase     1980
444  Rupert     Giles     1955
555  Dawn       Summers   1984

Employee
sin  firstName  lastName  salary
208  Clark      Kent      80000.55
111  Buffy      Summers   22000.78
412  Carol      Danvers   64000.00

The two relations are not union compatible, as birth is a DATE and salary is a REAL
We can carry out preliminary operations to make the relations union compatible
21
Union Compatible Relations
B
sin  firstName  lastName
208  Clark      Kent
111  Buffy      Summers
412  Carol      Danvers

B − A
sin  firstName  lastName
208  Clark      Kent
412  Carol      Danvers
24
Note on Set Difference
25
Intersection
A
sin  firstName  lastName
111  Buffy      Summers
222  Xander     Harris
333  Cordelia   Chase
444  Rupert     Giles
555  Dawn       Summers
• A ∩ B = R_result
27
Note on Intersect
• A ∩ B = R_result
[Figure: Venn diagrams of A ∩ B and A − B]
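One standard way to express intersection using only set difference is A ∩ B = A − (A − B). That this is the identity the diagrams on this slide illustrate is an assumption; the sketch below simply checks it on the A and B relations from the earlier slides, modelled as Python sets of tuples.

```python
A = {
    (111, "Buffy", "Summers"),
    (222, "Xander", "Harris"),
    (333, "Cordelia", "Chase"),
    (444, "Rupert", "Giles"),
    (555, "Dawn", "Summers"),
}
B = {
    (208, "Clark", "Kent"),
    (111, "Buffy", "Summers"),
    (412, "Carol", "Danvers"),
}

direct = A & B            # built-in intersection
via_difference = A - (A - B)  # intersection from set difference only

print(direct == via_difference)  # True
print(direct)                    # {(111, 'Buffy', 'Summers')}
```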
28
Cartesian Product
A(a1, …, an) × B(an+1, …, am) = R_result(a1, …, am)
29
Cartesian Product Example
σ_{lastName = "Summers"}(Customer)
sin  firstName  lastName  birth
111  Buffy      Summers   1981
555  Dawn       Summers   1984

Account
acc  type  balance   sin
01   CHQ   2101.76   111
02   SAV   11300.03  333
03   CHQ   20621.00  444

σ_{lastName = "Summers"}(Customer) × Account
1    firstName  lastName  birth  acc  type  balance   8
111  Buffy      Summers   1981   01   CHQ   2101.76   111
111  Buffy      Summers   1981   02   SAV   11300.03  333
111  Buffy      Summers   1981   03   CHQ   20621.00  444
555  Dawn       Summers   1984   01   CHQ   2101.76   111
555  Dawn       Summers   1984   02   SAV   11300.03  333
555  Dawn       Summers   1984   03   CHQ   20621.00  444
(Both inputs have a sin field, so the two clashing columns are referred to by position: 1 and 8)
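The example can be reproduced with a few lines of Python, assuming relations are sets of tuples. The helper name `cross` is made up for this sketch; `itertools.product` does the pairing.

```python
from itertools import product

# Customers whose lastName is "Summers" (the selection result above).
SUMMERS = {
    (111, "Buffy", "Summers", 1981),
    (555, "Dawn", "Summers", 1984),
}
ACCOUNT = {
    ("01", "CHQ", 2101.76, 111),
    ("02", "SAV", 11300.03, 333),
    ("03", "CHQ", 20621.00, 444),
}

def cross(a, b):
    """Cartesian product: every row of a concatenated with every row of b."""
    return {row_a + row_b for row_a, row_b in product(a, b)}

result = cross(SUMMERS, ACCOUNT)
print(len(result))  # 6
```

Each result row has 4 + 4 = 8 fields, and the row count is the product of the input sizes (2 × 3 = 6).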
30
Renaming
• It is sometimes useful to assign names to the
results of a relational algebra query
• The rename operator, ρ (rho)
• ρ_S(R) renames a relation
• ρ_{S(a1, a2, …, an)}(R) renames a relation and its attributes
• ρ_{new/old}(R) renames specified attributes
• Example: ρ_{sid1/1, sid2/8}(R) renames the fields at positions 1 and 8 of R to sid1 and sid2
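A minimal sketch of positional renaming, assuming a schema is just a list of field names and positions are 1-based as in the expression above. The function name `rename_positions` is hypothetical, and the schema is that of the cross-product result from the previous example.

```python
def rename_positions(schema, new_names):
    """Return a copy of the schema with the fields at the given
    1-based positions renamed; the input schema is not modified."""
    renamed = list(schema)
    for position, name in new_names.items():
        renamed[position - 1] = name
    return renamed

# Schema of the cross-product result, where sin appears twice.
R = ["sin", "firstName", "lastName", "birth", "acc", "type", "balance", "sin"]

print(rename_positions(R, {1: "sid1", 8: "sid2"}))
# ['sid1', 'firstName', 'lastName', 'birth', 'acc', 'type', 'balance', 'sid2']
```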
31
Largest Balance
• Find the account with the largest balance; return
accNumber
1. Find accounts whose balance is less than that of some other
account
σ_{Account.balance < d.balance}(Account × ρ_d(Account))
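Step 1 can be sketched in Python on the Account rows from the Cartesian product example. The second step shown here (subtracting the "smaller" accounts from all accounts) is the usual way to finish this query, but it is an assumption, since the remaining steps are not included in this text.

```python
from itertools import product

# (acc, type, balance, sin)
ACCOUNT = {
    ("01", "CHQ", 2101.76, 111),
    ("02", "SAV", 11300.03, 333),
    ("03", "CHQ", 20621.00, 444),
}
ACC, BALANCE = 0, 2  # column positions

# Step 1: pair Account with a renamed copy d of itself and keep the
# accounts whose balance is less than that of some other account.
smaller = {
    a[ACC]
    for a, d in product(ACCOUNT, ACCOUNT)
    if a[BALANCE] < d[BALANCE]
}

# Step 2 (assumed completion): all accounts minus the smaller ones.
largest = {a[ACC] for a in ACCOUNT} - smaller

print(sorted(smaller))  # ['01', '02']
print(largest)          # {'03'}
```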
32
Relational Algebra Operators
• Core 5 operations
• Selection (σ)
• Projection (π)
• Union (∪)
• Set Difference (−)
• Cross product (×)
• Additional operations
• Rename (ρ)
• Intersect (∩)
• Join (⋈)
33
Relational Algebra Exercises
• Student (sID, lastName, firstName, cgpa)
• 101, Jordan, Michael, 3.8
• Offering (oID, dept, cNum, term, instructor)
• abc, CMPT, 354, Fall 2018, Jiannan
• Took (sID, oID, grade)
• 101, abc, 95
1. sID of all students who have earned some grade over 80 and some grade below 50.
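One possible answer to exercise 1 is π_{sID}(σ_{grade > 80}(Took)) ∩ π_{sID}(σ_{grade < 50}(Took)). The sketch below evaluates it in Python; apart from the row (101, abc, 95), the Took rows are made-up sample data.

```python
# (sID, oID, grade); only the first row comes from the slide,
# the others are hypothetical sample data.
TOOK = {
    (101, "abc", 95),
    (101, "def", 40),
    (102, "abc", 85),
    (103, "def", 30),
}

# Students with some grade over 80, projected onto sID.
high = {sid for sid, _, grade in TOOK if grade > 80}

# Students with some grade below 50, projected onto sID.
low = {sid for sid, _, grade in TOOK if grade < 50}

answer = high & low
print(answer)  # {101}
```

A single selection with the condition grade > 80 ∧ grade < 50 would return nothing, because the two conditions must hold for different Took rows of the same student.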
36
Natural Join
• There is often a natural way to join two relations
• Join based on common attributes
• Eliminate duplicate common attributes from the result
Customer
sin  firstName  lastName  birth
111  Buffy      Summers   1981
222  Xander     Harris    1981
333  Cordelia   Chase     1980
444  Rupert     Giles     1955

Employee
sin  firstName  lastName  salary
208  Clark      Kent      80000.55
111  Buffy      Summers   22000.78
396  Dawn       Allen     41000.21

Customer ⋈ Employee
sin  firstName  lastName  birth  salary
111  Buffy      Summers   1981   22000.78
37
Natural Join
R ⋈ S
• Meaning: R ⋈ S = π_A(σ_θ(R × S))
• Where:
• Selection σ_θ checks equality of all common attributes
(i.e., attributes with the same names)
• Projection π_A eliminates duplicate common attributes
• The natural join of two tables with no fields in
common is the Cartesian product
• Not the empty set
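The definition above can be sketched in Python with rows stored as dictionaries, so that common attributes are simply the shared keys. The function `natural_join` is a hypothetical helper written as a plain nested loop; it is meant to mirror the definition, not an efficient join algorithm.

```python
def natural_join(r, s):
    """Natural join: pair rows that agree on all common attributes and
    keep one copy of each common attribute in the merged row."""
    result = []
    for a in r:
        for b in s:
            common = a.keys() & b.keys()
            if all(a[k] == b[k] for k in common):
                merged = {**a, **b}
                if merged not in result:  # set semantics
                    result.append(merged)
    return result

CUSTOMER = [
    {"sin": 111, "firstName": "Buffy", "lastName": "Summers", "birth": 1981},
    {"sin": 222, "firstName": "Xander", "lastName": "Harris", "birth": 1981},
    {"sin": 333, "firstName": "Cordelia", "lastName": "Chase", "birth": 1980},
    {"sin": 444, "firstName": "Rupert", "lastName": "Giles", "birth": 1955},
]
EMPLOYEE = [
    {"sin": 208, "firstName": "Clark", "lastName": "Kent", "salary": 80000.55},
    {"sin": 111, "firstName": "Buffy", "lastName": "Summers", "salary": 22000.78},
    {"sin": 396, "firstName": "Dawn", "lastName": "Allen", "salary": 41000.21},
]

joined = natural_join(CUSTOMER, EMPLOYEE)
print(joined)

# With no attributes in common, every pair matches: the Cartesian product.
no_common = natural_join([{"x": 1}], [{"y": 2}, {"y": 3}])
print(len(no_common))  # 2
```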
38
Natural Join Example
R
A    B         C        D
111  Buffy     Summers  1981
222  Xander    Harris   1981
333  Cordelia  Chase    1980
444  Rupert    Giles    1955

S
A    B      C        E
208  Clark  Kent     80000.55
111  Buffy  Summers  22000.78
396  Dawn   Allen    41000.21
39
Theta Join
R ⋈_θ S = σ_θ(R × S)
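A minimal sketch of this definition in Python, with relations as sets of tuples and θ as a Boolean function on a pair of rows. The join condition Customer.sin < Employee.sin is a made-up example, since the condition used on the slides is not preserved in this text; the helper name `theta_join` is also hypothetical.

```python
from itertools import product

CUSTOMER = {
    (111, "Buffy", "Summers", 1981),
    (222, "Xander", "Harris", 1981),
    (333, "Cordelia", "Chase", 1980),
    (444, "Rupert", "Giles", 1955),
    (555, "Dawn", "Summers", 1984),
}
EMPLOYEE = {
    (208, "Clark", "Kent", 80000.55),
    (111, "Buffy", "Summers", 22000.78),
    (412, "Carol", "Danvers", 64000.00),
}

def theta_join(r, s, theta):
    """Theta join: cross product followed by selection; no projection."""
    return {a + b for a, b in product(r, s) if theta(a, b)}

# Hypothetical condition: Customer.sin < Employee.sin
less_than = theta_join(CUSTOMER, EMPLOYEE, lambda c, e: c[0] < e[0])

# Equijoin: the condition uses only equality (Customer.sin = Employee.sin)
equi = theta_join(CUSTOMER, EMPLOYEE, lambda c, e: c[0] == e[0])

print(len(less_than))  # 4
print(len(equi))       # 1
```

Unlike the natural join, the equijoin result keeps both sin columns, so each row has 8 fields.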
40
Theta Join Example
Customer
sin  firstName  lastName  birth
111  Buffy      Summers   1981
222  Xander     Harris    1981
333  Cordelia   Chase     1980
444  Rupert     Giles     1955
555  Dawn       Summers   1984

Employee
sin  firstName  lastName  salary
208  Clark      Kent      80000.55
111  Buffy      Summers   22000.78
412  Carol      Danvers   64000.00

• Theta Join: R ⋈_θ S = σ_θ(R × S)
• Join of R and S with a join condition θ
• Cross-product followed by selection θ
• No projection
• Equijoin: R ⋈_θ S = σ_θ(R × S)
• Join condition θ consists only of equalities
• No projection
43
Relational Algebra Exercises
• Student (sID, lastName, firstName, cgpa)
• 101, Jordan, Michael, 3.8
• Course (dept, cNum, name, breadth)
• CMPT, 354, DB, True
• Offering (oID, dept, cNum, term, instructor)
• abc, CMPT, 354, Fall 2018, Jiannan
• Took (sID, oID, grade)
• 101, abc, 95
The names of all students who have passed a breadth course (grade >= 60 and breadth = True) with Martin
π_{lastName, firstName}(σ_{breadth = True ∧ grade ≥ 60 ∧ instructor = 'Martin'}(Student ⋈ Took ⋈ Offering ⋈ Course))
44
Different Plans, Same Results
• Semantic equivalence: results are always the same
π_{name}(σ_{cNum = 354}(R ⋈ S))
π_{name}(σ_{cNum = 354}(R) ⋈ S)
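The equivalence of the two plans can be checked on a small example. The schemas R(oID, cNum) and S(oID, name) and all rows below are hypothetical, chosen only so that cNum belongs to R and the selection can be pushed below the join.

```python
# Hypothetical relations: R(oID, cNum) and S(oID, name).
R = [
    {"oID": "abc", "cNum": 354},
    {"oID": "def", "cNum": 120},
    {"oID": "ghi", "cNum": 354},
]
S = [
    {"oID": "abc", "name": "Jordan"},
    {"oID": "def", "name": "Bird"},
    {"oID": "ghi", "name": "Johnson"},
]

def join_on_oid(r, s):
    """Natural join of R and S on their only common attribute, oID."""
    return [{**a, **b} for a in r for b in s if a["oID"] == b["oID"]]

# Plan 1: join everything, then select, then project.
joined_all = join_on_oid(R, S)
plan1 = {row["name"] for row in joined_all if row["cNum"] == 354}

# Plan 2: select on R first, then join the smaller input, then project.
r_filtered = [row for row in R if row["cNum"] == 354]
joined_filtered = join_on_oid(r_filtered, S)
plan2 = {row["name"] for row in joined_filtered}

print(plan1 == plan2)                           # True
print(len(joined_all), len(joined_filtered))    # 3 2
```

Both plans return the same names, but the second one builds a smaller intermediate result, which is the kind of difference a query optimizer exploits.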
46
Summary
• Relational Algebra (RA) operators
• Five core operators: selection, projection, cross-product,
union and set difference
• Additional operators are defined in terms of the core
operators: rename, intersection, join
• Theorem: SQL and RA can express exactly the same
class of queries
• Multiple RA queries can be equivalent
• Same semantics but different performance
• Forms the basis for query optimization