Chapter 4B. Relational Algebra II
COMP3278 Introduction to Database Management Systems

Department of Computer Science, The University of Hong Kong


Slides prepared by Dr. Chui Chun Kit, http://www.cs.hku.hk/~ckchui/, for students in COMP3278
For other uses, please email : [email protected]
Content
Additional relational algebra operations
Extended relational algebra operations
Equivalence rules
A simple illustration of query optimization

Section 1

Additional operators

Motivation
The fundamental operators of the relational algebra introduced in Chapter 4A are sufficient to express any relational-algebra query.
However, if we restrict ourselves to the fundamental operators, certain common queries are lengthy to express.
Therefore, we define additional operators that do not add any power to the algebra but simplify common queries.
That is, for each additional operator, we can give an equivalent expression that uses only the fundamental operators.
Additional operators
Set intersection ( ∩ )
Natural join ( ⋈ )
Assignment ( ← )
Left outer join ( ⟕ ), Right outer join ( ⟖ )
Division ( ÷ )
Many more
Θ-join, semi join, anti join, full outer join.
Set intersection
R ∩ S = R - (R - S)
R and S must have the same number of attributes, and the attribute domains must be compatible.
Example
Query: Find the employee_id of employees who work in both department 1 and department 3.

SQL:
SELECT employee_id
FROM Works_in
WHERE department_id=1
INTERSECT
SELECT employee_id
FROM Works_in
WHERE department_id=3

Relational algebra with set intersection:
( πemployee_id ( σdepartment_id=1 ( Works_in ) ) ) ∩ ( πemployee_id ( σdepartment_id=3 ( Works_in ) ) )

Equivalent relational algebra with only fundamental operators:
( πemployee_id ( σdepartment_id=1 ( Works_in ) ) ) -
( ( πemployee_id ( σdepartment_id=1 ( Works_in ) ) ) - ( πemployee_id ( σdepartment_id=3 ( Works_in ) ) ) )
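The identity R ∩ S = R - (R - S) can be checked on a small instance. The following is a minimal sketch using Python sets; the helper name `employees_in` is hypothetical, and the Works_in rows are the three-row instance used in this chapter's self-join example.

```python
# Works_in as a set of (employee_id, department_id) tuples.
works_in = {(1, 1), (2, 1), (1, 3)}


def employees_in(dept):
    """Mimics πemployee_id(σdepartment_id=dept(Works_in))."""
    return {emp for (emp, d) in works_in if d == dept}


in_dept1 = employees_in(1)
in_dept3 = employees_in(3)

# Using the additional operator (set intersection).
with_intersection = in_dept1 & in_dept3

# Using only the fundamental operator (set difference), as R - (R - S).
with_difference_only = in_dept1 - (in_dept1 - in_dept3)

print(with_intersection, with_difference_only)
```

Both expressions produce the same set, which is why intersection adds convenience but no expressive power.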
Set intersection
For your reference, there is another way to answer the same query: join the Works_in table with itself.

SQL:
SELECT DISTINCT W1.employee_id
FROM Works_in W1, Works_in W2
WHERE
W1.employee_id = W2.employee_id AND
W1.department_id = 1 AND
W2.department_id = 3

Relational algebra:
πW1.employee_id (σW1.department_id=1 ∧ W2.department_id=3 (σW1.employee_id=W2.employee_id (ρW1 (Works_in) × ρW2 (Works_in))))

Works_in
employee_id  department_id
1            1
2            1
1            3

Step 1. ρW1 (Works_in) × ρW2 (Works_in)
W1.employee_id  W1.department_id  W2.employee_id  W2.department_id
1               1                 1               1
1               1                 2               1
1               1                 1               3
2               1                 1               1
2               1                 2               1
2               1                 1               3
1               3                 1               1
1               3                 2               1
1               3                 1               3

Step 2. σW1.employee_id=W2.employee_id (ρW1 (Works_in) × ρW2 (Works_in))
W1.employee_id  W1.department_id  W2.employee_id  W2.department_id
1               1                 1               1
1               1                 1               3
2               1                 2               1
1               3                 1               1
1               3                 1               3

Step 3. σW1.department_id=1 ∧ W2.department_id=3 (σW1.employee_id=W2.employee_id (ρW1 (Works_in) × ρW2 (Works_in)))
W1.employee_id  W1.department_id  W2.employee_id  W2.department_id
1               1                 1               3

Step 4. Projecting on W1.employee_id gives the answer: employee 1.
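The self-join evaluation can be traced step by step. This is a rough sketch, assuming the three-row Works_in instance from the slides; the variable names are illustrative only.

```python
from itertools import product

# Works_in rows as (employee_id, department_id).
works_in = [(1, 1), (2, 1), (1, 3)]

# Cartesian product of two renamed copies, W1 and W2.
cross = list(product(works_in, works_in))

# Keep pairs that refer to the same employee.
same_employee = [(w1, w2) for (w1, w2) in cross if w1[0] == w2[0]]

# Keep pairs where W1 is in department 1 and W2 is in department 3.
dept1_and_3 = [(w1, w2) for (w1, w2) in same_employee
               if w1[1] == 1 and w2[1] == 3]

# Project on W1.employee_id; a set removes duplicates like DISTINCT.
result = {w1[0] for (w1, _w2) in dept1_and_3}

print(len(cross), len(same_employee), len(dept1_and_3), result)
```

The intermediate sizes (9, then 5, then 1 tuple) match the tables in the step-by-step illustration.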
Natural join
Usually, a query that involves a Cartesian product also applies a selection to the result of the Cartesian product.
That selection most often requires that all attributes common to the relations in the Cartesian product be equated.

r ⋈ s = πR ∪ S ( σr.A1 = s.A1 ∧ r.A2 = s.A2 ∧ … ∧ r.An = s.An ( r × s ) )

where R ∩ S = {A1, A2, … , An} is the set of common attributes, which the selection requires to be equated.
Natural join
The schema of R ⋈ S is R-schema ∪ S-schema (repeated attributes are removed).
For each pair of tuples tr from R and ts from S:
If tr and ts share the same value on each of the common attributes of R and S, a tuple t that combines tr and ts is added to the result of R ⋈ S.
Natural join
Common attributes: R ∩ S = {A}
Attributes of the resulting relation: R ∪ S = {A, B, C}

R          S
A  B       A  C
1  1       1  2
1  2       2  1
2  3

R × S
R.A  R.B  S.A  S.C
1    1    1    2
1    1    2    1
1    2    1    2
1    2    2    1
2    3    1    2
2    3    2    1

σR.A=S.A( R × S )
R.A  R.B  S.A  S.C
1    1    1    2
1    2    1    2
2    3    2    1

πA,B,C( σR.A=S.A( R × S ) ), which is equivalent to R ⋈ S
A  B  C
1  1  2
1  2  2
2  3  1
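The definition of natural join translates almost directly into code. Below is a minimal sketch, assuming relations are represented as lists of dictionaries (attribute name to value) and using a simple nested loop; the function name `natural_join` is hypothetical and nothing here reflects how a real DBMS implements joins.

```python
def natural_join(r, s):
    """Natural join of two relations given as lists of dicts."""
    if not r or not s:
        return []
    # Common attributes R ∩ S, taken from the first tuple of each relation.
    common = set(r[0]) & set(s[0])
    out = []
    for tr in r:
        for ts in s:
            # Keep the pair only if it agrees on every common attribute.
            if all(tr[a] == ts[a] for a in common):
                # Merging the dicts keeps each common attribute once.
                out.append({**tr, **ts})
    return out


R = [{"A": 1, "B": 1}, {"A": 1, "B": 2}, {"A": 2, "B": 3}]
S = [{"A": 1, "C": 2}, {"A": 2, "C": 1}]

joined = natural_join(R, S)
for row in joined:
    print(row)
```

Merging the two dictionaries plays the role of the projection on R ∪ S, since the shared attribute A is stored only once.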
Assignment
It is convenient to write a relational-algebra expression by assigning parts of it to temporary relation variables.
The assignment operator, denoted by "←", works like the assignment operator "=" in a programming language.

temp1 ← πa(R)
temp2 ← πa(S)
result ← temp1 - temp2

With the assignment operator, a query can be written as a sequential program.
temp1, temp2 and result are called "relation variables".
Outer join
The outer-join operator is an extension of the join operation to deal with missing information.

Customer
customer_id  name    address
C1           Kit     CB320
C2           Ben     CB326
C3           Jolly   CB311
C4           Yvonne  CB415

Depositor
account_id  customer_id
A1          C1
A1          C2
A2          C2
A3          C4
A4          C4

Customer ⋈ Depositor
customer_id  name    address  account_id
C1           Kit     CB320    A1
C2           Ben     CB326    A1
C2           Ben     CB326    A2
C4           Yvonne  CB415    A3
C4           Yvonne  CB415    A4

A natural join (e.g., joining Customer and Depositor) results in a table without the information of customers who have no account (e.g., C3 in this case).
An outer join returns the natural join result, plus the tuples that do not match any tuples from the other side.
Left outer join
R ⟕ S = ( R ⋈ S ) ∪ ( ( R - πR( R ⋈ S ) ) × { (null, … , null ) } )

Let's illustrate why the left outer join is defined this way through a step-by-step illustration, with R = Customer and S = Depositor.

Customer
customer_id  name    address
C1           Kit     CB320
C2           Ben     CB326
C3           Jolly   CB311
C4           Yvonne  CB415

Depositor
account_id  customer_id
A1          C1
A1          C2
A2          C2
A3          C4
A4          C4

Customer ⋈ Depositor
customer_id  name    address  account_id
C1           Kit     CB320    A1
C2           Ben     CB326    A1
C2           Ben     CB326    A2
C4           Yvonne  CB415    A3
C4           Yvonne  CB415    A4

Step 1. Finding the tuples missed in the natural join
R - πR( R ⋈ S ) generates the tuples in R (i.e., Customer) that are missed in the natural join.
C3 Jolly in Customer does not have any matching record in Depositor, so we use this part to recover Jolly's record.

Customer - πCustomer's attributes (Customer ⋈ Depositor)
customer_id  name   address
C3           Jolly  CB311

Step 2. Constructing the missed tuple by adding null values to the extra attributes
The "× { (null, … , null ) }" part simply adds back the remaining columns with null values, because there are no matching records in S (i.e., Depositor).

( Customer - πCustomer's attributes (Customer ⋈ Depositor) ) × { (null, … , null ) }
customer_id  name   address  account_id
C3           Jolly  CB311    null

Step 3. Taking the union with the natural join result
(Customer ⋈ Depositor) ∪ ( ( Customer - πCustomer's attributes (Customer ⋈ Depositor) ) × { (null, … , null ) } ), which is equivalent to Customer ⟕ Depositor
customer_id  name    address  account_id
C1           Kit     CB320    A1
C2           Ben     CB326    A1
C2           Ben     CB326    A2
C3           Jolly   CB311    null
C4           Yvonne  CB415    A3
C4           Yvonne  CB415    A4
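The three parts of the left outer join definition can be computed one after another. This is a sketch under the assumption that tuples are plain Python tuples and that `None` stands in for null; the variable names are illustrative.

```python
# Customer rows: (customer_id, name, address)
customer = [("C1", "Kit", "CB320"), ("C2", "Ben", "CB326"),
            ("C3", "Jolly", "CB311"), ("C4", "Yvonne", "CB415")]
# Depositor rows: (account_id, customer_id)
depositor = [("A1", "C1"), ("A1", "C2"), ("A2", "C2"),
             ("A3", "C4"), ("A4", "C4")]

# Customer ⋈ Depositor, joined on customer_id.
joined = {(cid, name, addr, acc)
          for (cid, name, addr) in customer
          for (acc, dep_cid) in depositor
          if cid == dep_cid}

# Customer - π_Customer's attributes(Customer ⋈ Depositor)
matched = {(cid, name, addr) for (cid, name, addr, _acc) in joined}
unmatched = set(customer) - matched

# "× {(null, ..., null)}": pad the unmatched tuples with None.
padded = {(cid, name, addr, None) for (cid, name, addr) in unmatched}

left_outer = joined | padded
for row in sorted(left_outer, key=lambda t: (t[0], str(t[3]))):
    print(row)
```

Swapping the roles of the two relations in the difference step gives the right outer join in the same way.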
Right outer join
R ⟖ S = ( R ⋈ S ) ∪ ( { (null, … , null ) } × ( S - πS( R ⋈ S ) ) )

Let's illustrate why the right outer join is defined this way through a step-by-step illustration, with R = Customer and S = Depositor.

Customer
customer_id  name    address
C1           Kit     CB320
C2           Ben     CB326
C3           Jolly   CB311
C4           Yvonne  CB415

Depositor
account_id  customer_id
A1          C1
A1          C2
A2          C2
A3          C4
A4          C5

Customer ⋈ Depositor
customer_id  name    address  account_id
C1           Kit     CB320    A1
C2           Ben     CB326    A1
C2           Ben     CB326    A2
C4           Yvonne  CB415    A3

Step 1. Finding the tuples missed in the natural join
S - πS( R ⋈ S ) generates the tuples in S (i.e., Depositor) that are missed in the natural join.
C5 in Depositor does not have any matching record in Customer, so we use this part to recover C5's record.

Depositor - πDepositor's attributes (Customer ⋈ Depositor)
customer_id  account_id
C5           A4

Step 2. Constructing the missed tuple by adding null values to the extra attributes
The "{ (null, … , null ) }" part simply adds back the remaining columns with null values, because there are no matching records in R (i.e., Customer).

customer_id  name  address  account_id
C5           null  null     A4

Step 3. Taking the union with the natural join result, which is equivalent to Customer ⟖ Depositor
customer_id  name    address  account_id
C1           Kit     CB320    A1
C2           Ben     CB326    A1
C2           Ben     CB326    A2
C4           Yvonne  CB415    A3
C5           null    null     A4
Division
Notation: R ÷ S

R          S      R ÷ S
A  B       B      A
1  1       1      2
2  1       2      4
2  2
3  3
4  1
4  2
4  3

Definition
Let S ⊆ R, i.e., the attributes of relation S are a subset of the attributes of relation R.

R ÷ S = { t | t ∈ πR-S(R) ∧ ( ∀ s ∈ S, ( (t ∪ s) ∈ R ) ) }

Condition 1. A resulting tuple t has to be in the relation πR-S(R). Here πR-S(R) contains A = 1, 2, 3, 4.
Condition 2. If we combine t with each tuple s ∈ S, all the combined tuples have to be included in R.

Step 1. Consider each tuple t ∈ πR-S(R).
Step 2. "∀ s ∈ S, t ∪ s" part: generate tuples by taking the union of t with every tuple s ∈ S.
Step 3. "∀ s ∈ S, ( (t ∪ s) ∈ R )" part: check whether all the generated tuples are in R.

t1 = (A=1): the generated tuples are (1, 1) and (1, 2). Not both tuples are in R, so t1 is NOT in the result of R ÷ S.
t2 = (A=2): the generated tuples are (2, 1) and (2, 2). Both tuples are in R, so t2 is in the result of R ÷ S.
Division
Observation
Let's focus on the result of R ÷ S (say, the tuple A=2). It means that, for the tuples in relation R whose value in attribute A equals 2, the values in attribute B cover ALL values in attribute B of S.

Division is used to express queries with "all".
Example: Find the IDs of all students who have taken all CS courses (dpt_id = 1).

Student
student_id  name    dpt_id
1           Peter   1
2           Sharon  1
3           David   2
4           Joe     3

Takes
student_id  course_id  Grade
1           1          A
1           2          B
1           3          A+
2           3          B-
3           3          B
4           1          C
4           2          A-

Course
course_id  title          dpt_id  credit
1          Intro to DB    1       6
2          Programming I  1       6
3          Accounting     2       6
Division
Step 1. All part (the relation S): What are the course IDs of all CS courses (dpt_id = 1)?

σ dpt_id=1(Course)
course_id  title          dpt_id  credit
1          Intro to DB    1       6
2          Programming I  1       6

πcourse_id (σ dpt_id=1(Course) )
course_id
1
2

Step 2. Dividend part (the relation R): What is the list of Takes tuples (i.e., all < student_id, course_id > pairs)?

π student_id,course_id (Takes)
student_id  course_id
1           1
1           2
1           3
2           3
3           3
4           1
4           2

Step 3. Division: Which students in π student_id,course_id (Takes) take all CS courses in πcourse_id (σ dpt_id=1(Course) )?

π student_id,course_id (Takes) ÷ πcourse_id (σ dpt_id=1(Course) )
student_id
1
4

Explanation: Let's focus on student_id = 1 in the result.
It means that, for the tuples in Takes with student_id=1, the values in the course_id attribute cover both 1 and 2 (i.e., all the CS courses).
That is to say, the student with student_id 1 takes ALL CS courses.
The same argument applies to student_id 4.
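The division steps above can be sketched in a few lines. The first function follows the set-builder definition directly. The second uses the standard textbook rewriting of division with fundamental operators only, R ÷ S = πR-S(R) - πR-S( (πR-S(R) × S) - R ), which is not stated on the slides and is included here as an assumption-labelled illustration. Both function names are hypothetical, and R is restricted to two attributes for brevity.

```python
# π student_id,course_id (Takes) as a set of pairs.
takes = {(1, 1), (1, 2), (1, 3), (2, 3), (3, 3), (4, 1), (4, 2)}
# π course_id (σ dpt_id=1 (Course)) as a set of values.
cs_courses = {1, 2}


def divide(r, s):
    """R ÷ S from the definition: keep t if (t, x) is in R for every x in S."""
    candidates = {t for (t, _x) in r}
    return {t for t in candidates if all((t, x) in r for x in s)}


def divide_fundamental(r, s):
    """R ÷ S using only projection, Cartesian product and set difference."""
    candidates = {t for (t, _x) in r}
    # Pairs that would have to exist in R but do not.
    missing = {(t, x) for t in candidates for x in s} - r
    disqualified = {t for (t, _x) in missing}
    return candidates - disqualified


print(divide(takes, cs_courses))
print(divide_fundamental(takes, cs_courses))
```

Both functions return students 1 and 4 on this instance, matching the result table of Step 3.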
Division
Division has a property: (R × S) ÷ S = R

R          S
A  B       C  D
1  1       1  2
1  2       2  2
2  3

R × S
A  B  C  D
1  1  1  2
1  1  2  2
1  2  1  2
1  2  2  2
2  3  1  2
2  3  2  2

(R × S) ÷ S = R
A  B
1  1
1  2
2  3

Self-study question
How about (R ÷ S) × S = R?
Is this true in relational algebra?
Can you prove or disprove it?
Section 2

Extended operators
Aggregation
An aggregate function takes a collection of values and returns a single value as its result.
e.g., avg, min, max, sum, count, count-distinct

The aggregate operation in relational algebra has two parts:
Grouping - divides the tuples into groups.
Aggregation - computes an aggregate function over each group to create a result tuple.
Aggregation
G1, G2, …, Gn g F1(A1), F2(A2), …, Fn(An) (E)

E - a relation (can be the result of a relational-algebra expression).
G1, …, Gn - attributes used to form groups.
Tuples with the same values in G1 to Gn are put into the same group.
G can be empty, which means that the whole relation is one group.
Fi(Ai) - an aggregate function applied on an attribute.
Aggregation
Account
branch_id  account_id  balance
B1         A1          500
B2         A2          400
B2         A3          900
B1         A4          700

ρ Result(branch_id, sum_of_balance) ( branch_id g sum(balance) (Account) )

Step 1. Group the tuples in Account according to their branch_id.
Step 2. Aggregate the tuples in each group by summing their values in the balance attribute.
Step 3. Since the resulting relation has no name after aggregation, we use the renaming operator to give names to the relation and its attributes.

Result
branch_id  sum_of_balance
B1         1200
B2         1300
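The grouping and aggregation steps can be mirrored with a dictionary keyed by the grouping attribute. A minimal sketch, assuming the Account rows from the slide are held as tuples:

```python
from collections import defaultdict

# Account rows: (branch_id, account_id, balance)
account = [("B1", "A1", 500), ("B2", "A2", 400),
           ("B2", "A3", 900), ("B1", "A4", 700)]

# Steps 1 and 2: group by branch_id and sum the balance within each group.
totals = defaultdict(int)
for branch_id, _account_id, balance in account:
    totals[branch_id] += balance

# Step 3: the "renamed" result, as (branch_id, sum_of_balance) rows.
result = sorted(totals.items())
print(result)
```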
Aggregation
Note that grouping can be done on multiple attributes.
E.g., in this case, we group the tuples in Student that have the same values in both the dpt_id and gender attributes, i.e., we are finding the number of male / female students in each department.

Student
student_id  name    dpt_id  gender
1           Peter   1       M
2           Sharon  1       F
3           David   2       M
4           Joe     2       M
5           Betty   1       F

ρ Result(dpt_id, gender, count) ( dpt_id, gender g count(*) (Student) )

Result
dpt_id  gender  count
1       M       1
1       F       2
2       M       2
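Grouping on several attributes only changes the group key. As a small sketch, assuming the Student rows above, a `Counter` keyed by the pair (dpt_id, gender) yields the same counts:

```python
from collections import Counter

# Student rows: (student_id, name, dpt_id, gender)
student = [(1, "Peter", 1, "M"), (2, "Sharon", 1, "F"), (3, "David", 2, "M"),
           (4, "Joe", 2, "M"), (5, "Betty", 1, "F")]

# Group on (dpt_id, gender) and count the tuples in each group.
counts = Counter((dpt_id, gender) for (_sid, _name, dpt_id, gender) in student)

for (dpt_id, gender), n in counts.items():
    print(dpt_id, gender, n)
```

Groups with no tuples (here, department 2 with gender F) simply do not appear in the result.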
Section 3

Algebraic properties
Transformation of expression
A query can be expressed in several different ways, with different costs of evaluation.
Two relational-algebra expressions are said to be equivalent if, on every legal database instance, the two expressions generate the same set of tuples.
In the following discussion, we treat a relation as a set of tuples; the order of the tuples is irrelevant.
Equivalence rules
Rule 1. Only the final (outermost) projection in a sequence of projection operations is needed.
πL1 (πL2 (…(πLn ( E ))…)) = πL1 (E)

Example, written as a regular (linear) expression:
πemployee_id( πemployee_id,salary( Employee ) ) is equivalent to πemployee_id( Employee )

Expression tree:
πemployee_id
  πemployee_id,salary
    Employee

An expression tree tells which operator is executed ahead of another.
It allows us to transform the execution order (alter the tree) by applying the equivalence rules.

Transformed expression tree (the inner projection is removed):
πemployee_id
  Employee

Employee
employee_id  name   salary
1            Jones  26000
2            Smith  28000
3            Smith  24000

πemployee_id,salary(Employee)
employee_id  salary
1            26000
2            28000
3            24000

πemployee_id(πemployee_id,salary(Employee)) = πemployee_id(Employee)
employee_id
1
2
3
Equivalence rules
Rule 2. Conjunctive selection operations can be deconstructed into a sequence of individual selections.
σp1 ∧ p2( E ) = σp1 (σp2 ( E ) )

Example:
σname="Smith" ∧ salary>24000( Employee ) is equivalent to σname="Smith" ( σsalary>24000 ( Employee ) )

Employee
employee_id  name   salary
1            Jones  26000
2            Smith  28000
3            Smith  24000

σsalary>24000(Employee)
employee_id  name   salary
1            Jones  26000
2            Smith  28000

σname="Smith" (σsalary>24000 ( Employee ))
employee_id  name   salary
2            Smith  28000

You may wonder why breaking down conjunctive selections is useful.
It helps to reduce the size of temporary results: we can try to push each selection predicate down the tree (to perform selection as early as possible).
Equivalence rules
Rule 3. Selection operations are commutative.
σp1 (σp2 ( E ) ) = σp2 (σp1 ( E ) )

Example:
σname="Smith" (σsalary>24000 ( Employee )) is equivalent to σsalary>24000 (σname="Smith" ( Employee ))

Employee
employee_id  name   salary
1            Jones  26000
2            Smith  28000
3            Smith  24000
4            David  25000

Left expression, temporary relation σsalary>24000(Employee)
employee_id  name   salary
1            Jones  26000
2            Smith  28000
4            David  25000

Right expression, temporary relation σname="Smith"(Employee)
employee_id  name   salary
2            Smith  28000
3            Smith  24000

Result of both expressions
employee_id  name   salary
2            Smith  28000

Note that the two executions have different costs. In particular, the sizes of their temporary relations are different.
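Rules 2 and 3 can be checked on the four-row Employee instance. The sketch below is illustrative only; `select`, `is_smith` and `earns_over_24000` are hypothetical helper names, and temporary relation size is used as a rough stand-in for cost.

```python
# Employee rows: (employee_id, name, salary)
employee = [(1, "Jones", 26000), (2, "Smith", 28000),
            (3, "Smith", 24000), (4, "David", 25000)]


def select(pred, rel):
    """σ_pred(rel): keep the tuples that satisfy the predicate."""
    return [t for t in rel if pred(t)]


def is_smith(t):
    return t[1] == "Smith"


def earns_over_24000(t):
    return t[2] > 24000


# Rule 2: one conjunctive selection versus a cascade of two selections.
conjunctive = select(lambda t: is_smith(t) and earns_over_24000(t), employee)

# Rule 3: the cascade can be applied in either order.
salary_first = select(earns_over_24000, employee)   # temporary relation
plan_a = select(is_smith, salary_first)
name_first = select(is_smith, employee)             # temporary relation
plan_b = select(earns_over_24000, name_first)

print(conjunctive, plan_a, plan_b)
print(len(salary_first), len(name_first))
```

All three plans return the same tuple, while the temporary relations hold 3 and 2 tuples respectively.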
Equivalence rules
Rule 4. Natural join operations are associative.
( E1 ⋈ E2 ) ⋈ E3 = E1 ⋈ ( E2 ⋈ E3 )

Example: (Employee ⋈ Works_in) ⋈ Department is equivalent to Employee ⋈ ( Works_in ⋈ Department ).
Although both expression trees return the same resulting relation, they have different costs because the sizes of their temporary relations are different.

Employee
employee_id  name    salary
1            Jones   26000
2            Smith   28000
3            Parker  35000
4            Smith   24000

Works_in
employee_id  department_id
1            1
2            1
2            2
3            3
4            3

Department
department_id  name
1              Toys
2              Tools
3              Food

Expression tree A. (Employee ⋈ Works_in) ⋈ Department
The first natural join evaluates 4*5 = 20 combinations; the resulting temporary relation consists of 5 tuples and 4 columns.

Employee ⋈ Works_in
employee_id  name    salary  department_id
1            Jones   26000   1
2            Smith   28000   1
2            Smith   28000   2
3            Parker  35000   3
4            Smith   24000   3

Expression tree B. Employee ⋈ ( Works_in ⋈ Department )
The first natural join evaluates 5*3 = 15 combinations; the resulting temporary relation consists of 5 tuples and 3 columns.

Works_in ⋈ Department
employee_id  department_id  name
1            1              Toys
2            1              Toys
2            2              Tools
3            3              Food
4            3              Food

Result of both expression trees
employee_id  Employee.name  salary  department_id  Department.name
1            Jones          26000   1              Toys
2            Smith          28000   1              Toys
2            Smith          28000   2              Tools
3            Parker         35000   3              Food
4            Smith          24000   3              Food
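The two join orders can be compared on the sample data. This sketch assumes a simple nested-loop `natural_join` helper (hypothetical) and renames Department.name to `dept_name`, an assumption made so that a strict natural join does not try to equate employee names with department names.

```python
def natural_join(r, s):
    """Nested-loop natural join over lists of dicts."""
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])
    return [{**tr, **ts} for tr in r for ts in s
            if all(tr[a] == ts[a] for a in common)]


employee = [
    {"employee_id": 1, "name": "Jones", "salary": 26000},
    {"employee_id": 2, "name": "Smith", "salary": 28000},
    {"employee_id": 3, "name": "Parker", "salary": 35000},
    {"employee_id": 4, "name": "Smith", "salary": 24000},
]
works_in = [
    {"employee_id": 1, "department_id": 1},
    {"employee_id": 2, "department_id": 1},
    {"employee_id": 2, "department_id": 2},
    {"employee_id": 3, "department_id": 3},
    {"employee_id": 4, "department_id": 3},
]
department = [
    {"department_id": 1, "dept_name": "Toys"},
    {"department_id": 2, "dept_name": "Tools"},
    {"department_id": 3, "dept_name": "Food"},
]

# Expression tree A: (Employee ⋈ Works_in) ⋈ Department
tmp_a = natural_join(employee, works_in)
plan_a = natural_join(tmp_a, department)

# Expression tree B: Employee ⋈ (Works_in ⋈ Department)
tmp_b = natural_join(works_in, department)
plan_b = natural_join(employee, tmp_b)


def as_set(rel):
    """Order-independent view of a relation for comparison."""
    return {frozenset(row.items()) for row in rel}


print(as_set(plan_a) == as_set(plan_b))
print(len(tmp_a), len(tmp_a[0]), len(tmp_b), len(tmp_b[0]))
```

Both plans yield the same five tuples; the temporary relations differ in width (4 columns versus 3).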
Equivalence rules
Rule 5. The selection operation distributes over the natural join operation under the following two conditions.
Rule 5a. It distributes when all the attributes in the selection condition involve only the attributes of one of the expressions (say, E1) being joined.
σp ( E1 ⋈ E2 ) = ( σp ( E1 ) ⋈ E2 )

Example:
σEmployee.name="Smith"( Employee ⋈ Works_in ) is equivalent to ( σEmployee.name="Smith" (Employee) ⋈ Works_in )

Employee
employee_id  name    salary
1            Jones   26000
2            Smith   28000
3            Parker  35000
4            Smith   24000

Works_in
employee_id  department_id
1            1
2            1
2            2
3            3
4            3

Expression tree A. σEmployee.name="Smith"( Employee ⋈ Works_in )
The natural join evaluates 4*5 = 20 combinations; the resulting temporary relation consists of 5 tuples and 4 columns.

Employee ⋈ Works_in
employee_id  name    salary  department_id
1            Jones   26000   1
2            Smith   28000   1
2            Smith   28000   2
3            Parker  35000   3
4            Smith   24000   3

σEmployee.name="Smith"( Employee ⋈ Works_in )
employee_id  name   salary  department_id
2            Smith  28000   1
2            Smith  28000   2
4            Smith  24000   3

Expression tree B. σEmployee.name="Smith" (Employee) ⋈ Works_in
The natural join evaluates 2*5 = 10 combinations; the resulting relation consists of 3 tuples and 4 columns.

σEmployee.name="Smith" (Employee)
employee_id  name   salary
2            Smith  28000
4            Smith  24000

σEmployee.name="Smith" (Employee) ⋈ Works_in
employee_id  name   salary  department_id
2            Smith  28000   1
2            Smith  28000   2
4            Smith  24000   3

Comparing with the equivalent expression tree A, we can see that if we push the selection predicate down below the natural join (perform the selection earlier than the join), we will probably have a smaller temporary relation.
Equivalence rules
Rule 5b. The selection distributes when selection condition p1 involves only the attributes of E1 and p2 involves only the attributes of E2.
σp1 ∧ p2( E1 ⋈ E2 ) = ( σp1 ( E1 ) ⋈ σp2 ( E2 ) )

Example:
σEmployee.name="Smith" ∧ Works_in.department_id=3( Employee ⋈ Works_in ) is equivalent to ( σEmployee.name="Smith" (Employee) ⋈ σWorks_in.department_id=3 ( Works_in ) )

Employee
employee_id  name    salary
1            Jones   26000
2            Smith   28000
3            Parker  35000
4            Smith   24000

Works_in
employee_id  department_id
1            1
2            1
2            2
3            3
4            3

Expression tree A. σEmployee.name="Smith" ∧ Works_in.department_id=3( Employee ⋈ Works_in )
The natural join evaluates 4*5 = 20 combinations.

Employee ⋈ Works_in
employee_id  name    salary  department_id
1            Jones   26000   1
2            Smith   28000   1
2            Smith   28000   2
3            Parker  35000   3
4            Smith   24000   3

Result
employee_id  name   salary  department_id
4            Smith  24000   3

Expression tree B. ( σEmployee.name="Smith" (Employee) ⋈ σWorks_in.department_id=3 ( Works_in ) )
The natural join evaluates 2*2 = 4 combinations.

σEmployee.name="Smith"( Employee )
employee_id  name   salary
2            Smith  28000
4            Smith  24000

σWorks_in.department_id=3( Works_in )
employee_id  department_id
3            3
4            3

Result
employee_id  name   salary  department_id
4            Smith  24000   3

Comparing with the equivalent expression tree A, we can see that if we push the selection predicates down below the natural join (perform the selections earlier than the join), the natural join considers fewer combinations.
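The effect of pushing selections below the join (Rules 5a and 5b) can be measured by counting the tuple pairs a join examines. This is a sketch under the assumption of a naive nested-loop join; the `join` helper and its cost figure are illustrative, not how a real optimizer estimates cost.

```python
# Employee rows: (employee_id, name, salary); Works_in rows: (employee_id, department_id)
employee = [(1, "Jones", 26000), (2, "Smith", 28000),
            (3, "Parker", 35000), (4, "Smith", 24000)]
works_in = [(1, 1), (2, 1), (2, 2), (3, 3), (4, 3)]


def join(emps, works):
    """Nested-loop join on employee_id; also returns the pairs examined."""
    pairs_examined = len(emps) * len(works)
    rows = [(e, n, s, d) for (e, n, s) in emps
            for (we, d) in works if e == we]
    return rows, pairs_examined


# Tree A of Rule 5a: join first, then select name = "Smith".
joined, cost_a = join(employee, works_in)
plan_a = [r for r in joined if r[1] == "Smith"]

# Tree B of Rule 5a: select first, then join.
smiths = [e for e in employee if e[1] == "Smith"]
plan_b, cost_b = join(smiths, works_in)

# Tree B of Rule 5b: push both selections below the join.
dept3 = [w for w in works_in if w[1] == 3]
plan_c, cost_c = join(smiths, dept3)

print(cost_a, cost_b, cost_c)
print(plan_a == plan_b, plan_c)
```

The examined pairs drop from 20 to 10 to 4, matching the combination counts quoted for the expression trees.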
Equivalence rules
Rule 6. The projection operation can distribute over the natural join operation.

πL1 ∪ L2 ( E1 ⋈ E2 ) = πL1 ∪ L2 ( (πL1 ∪ L3 (E1)) ⋈ (πL2 ∪ L3 (E2)) )

Let L1 and L2 be some attributes from E1 and E2, respectively.
Let L3 be the attributes that are involved in the join condition but are not in L1 ∪ L2.

Example:
πEmployee.name, Works_in.department_id ( Employee ⋈ Works_in )
L1 = Employee.name
L2 = Works_in.department_id
L3 = employee_id (the attribute used in the natural join)
is equivalent to
πEmployee.name, Works_in.department_id ( πEmployee.name, Employee.employee_id (Employee) ⋈ πWorks_in.department_id, Works_in.employee_id (Works_in) )

Employee
employee_id  name    salary
1            Jones   26000
2            Smith   28000
3            Parker  35000
4            Smith   24000

Works_in
employee_id  department_id  since
1            1              2012/1/1
2            1              2011/3/2
2            2              2014/2/1
3            3              2013/2/2
4            3              2013/2/8

Expression tree A. πEmployee.name, Works_in.department_id ( Employee ⋈ Works_in )
The natural join evaluates 4*5 = 20 combinations; the resulting temporary relation consists of 5 columns.

Employee ⋈ Works_in
employee_id  name    salary  department_id  since
1            Jones   26000   1              2012/1/1
2            Smith   28000   1              2011/3/2
2            Smith   28000   2              2014/2/1
3            Parker  35000   3              2013/2/2
4            Smith   24000   3              2013/2/8

Expression tree B. πEmployee.name, Works_in.department_id ( πEmployee.name, Employee.employee_id (Employee) ⋈ πWorks_in.department_id, Works_in.employee_id (Works_in) )
The natural join evaluates 4*5 = 20 combinations; the resulting temporary relation consists of 3 columns.

πEmployee.name, Employee.employee_id( Employee )
employee_id  name
1            Jones
2            Smith
3            Parker
4            Smith

πWorks_in.department_id, Works_in.employee_id( Works_in )
employee_id  department_id
1            1
2            1
2            2
3            3
4            3

πEmployee.name, Employee.employee_id( Employee ) ⋈ πWorks_in.department_id, Works_in.employee_id( Works_in )
employee_id  name    department_id
1            Jones   1
2            Smith   1
2            Smith   2
3            Parker  3
4            Smith   3

Result of both expression trees
name    department_id
Jones   1
Smith   1
Smith   2
Parker  3
Smith   3
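Rule 6 can be checked by comparing the width of the temporary relation in each plan. A minimal sketch, assuming the Employee and Works_in rows from the slides held as tuples; all variable names are illustrative.

```python
# Employee rows: (employee_id, name, salary)
employee = [(1, "Jones", 26000), (2, "Smith", 28000),
            (3, "Parker", 35000), (4, "Smith", 24000)]
# Works_in rows: (employee_id, department_id, since)
works_in = [(1, 1, "2012/1/1"), (2, 1, "2011/3/2"), (2, 2, "2014/2/1"),
            (3, 3, "2013/2/2"), (4, 3, "2013/2/8")]

# Tree A: join the full relations, then project on (name, department_id).
tmp_a = [(e, n, s, d, since)
         for (e, n, s) in employee
         for (we, d, since) in works_in if e == we]
plan_a = {(n, d) for (_e, n, _s, d, _since) in tmp_a}

# Tree B: project early, keeping the join attribute employee_id (L3).
emp_narrow = {(n, e) for (e, n, _s) in employee}
works_narrow = {(d, we) for (we, d, _since) in works_in}
tmp_b = [(e, n, d)
         for (n, e) in emp_narrow
         for (d, we) in works_narrow if e == we]
plan_b = {(n, d) for (_e, n, d) in tmp_b}

print(plan_a == plan_b)
print(len(tmp_a[0]), len(tmp_b[0]))
```

The join attribute must survive the early projection; dropping employee_id before the join would change the result.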
Equivalence rules
Rule 7. The set operations union and intersection are commutative.
E1 ∪ E2 = E2 ∪ E1
E1 ∩ E2 = E2 ∩ E1
The set difference operation is NOT commutative.
E1 - E2 ≠ E2 - E1
Equivalence rules
Rule 8. The set operations union and intersection are associative.
(E1 ∪ E2) ∪ E3 = E1 ∪ ( E2 ∪ E3 )
(E1 ∩ E2) ∩ E3 = E1 ∩ ( E2 ∩ E3 )
The set difference operation is NOT associative.
(E1 - E2) - E3 ≠ E1 - ( E2 - E3 )
Equivalence rules
Rule 9. The selection operation distributes over the union, intersection and set difference operations.
σp ( E1 ∪ E2 ) = σp ( E1 ) ∪ σp ( E2 )
σp ( E1 ∩ E2 ) = σp ( E1 ) ∩ σp ( E2 )
σp ( E1 - E2 ) = σp ( E1 ) - σp ( E2 )

Audio_CD
ID   name       provider_id  stock  #tracks
CD1  One Heart  P1           55     14
CD2  Miracle    P2           4      14

DVD
ID    name                    provider_id  stock  length
DVD1  Prince of Persia        P2           3      110
DVD2  Iron man 3              P3           60     90
DVD3  Legend is born: Ip Man  P3           17     90

Example:
σstock<10 ( πname, provider_id, stock ( Audio_CD ) ∪ πname, provider_id, stock ( DVD ) )
is equivalent to
σstock<10 ( πname, provider_id, stock ( Audio_CD ) ) ∪ σstock<10 ( πname, provider_id, stock ( DVD ) )
Equivalence rules
σ stock<10 ( π name, provider_id, stock ( Audio_CD ) ∪ π name, provider_id, stock ( DVD ) )

Expression tree (root at the top):
σ stock<10
    ∪
        π name, provider_id, stock
            Audio_CD
        π name, provider_id, stock
            DVD

Audio_CD
ID   name       provider_id  stock  #tracks
CD1  One Heart  P1           55     14
CD2  Miracle    P2           4      14

DVD
ID    name                    provider_id  stock  length
DVD1  Prince of Persia        P2           3      110
DVD2  Iron man 3              P3           60     90
DVD3  Legend is born: Ip Man  P3           17     90

π name, provider_id, stock ( Audio_CD )
name       provider_id  stock
One Heart  P1           55
Miracle    P2           4

π name, provider_id, stock ( DVD )
name                    provider_id  stock
Prince of Persia        P2           3
Iron man 3              P3           60
Legend is born: Ip Man  P3           17

π name, provider_id, stock ( Audio_CD ) ∪ π name, provider_id, stock ( DVD )
name                    provider_id  stock
One Heart               P1           55
Miracle                 P2           4
Prince of Persia        P2           3
Iron man 3              P3           60
Legend is born: Ip Man  P3           17

σ stock<10 ( π name, provider_id, stock ( Audio_CD ) ∪ π name, provider_id, stock ( DVD ) )
name              provider_id  stock
Miracle           P2           4
Prince of Persia  P2           3
Equivalence rules
σ stock<10 ( π name, provider_id, stock ( Audio_CD ) ) ∪ σ stock<10 ( π name, provider_id, stock ( DVD ) )

Expression tree (root at the top):
∪
    σ stock<10
        π name, provider_id, stock
            Audio_CD
    σ stock<10
        π name, provider_id, stock
            DVD

Audio_CD
ID   name       provider_id  stock  #tracks
CD1  One Heart  P1           55     14
CD2  Miracle    P2           4      14

DVD
ID    name                    provider_id  stock  length
DVD1  Prince of Persia        P2           3      110
DVD2  Iron man 3              P3           60     90
DVD3  Legend is born: Ip Man  P3           17     90

π name, provider_id, stock ( Audio_CD )
name       provider_id  stock
One Heart  P1           55
Miracle    P2           4

π name, provider_id, stock ( DVD )
name                    provider_id  stock
Prince of Persia        P2           3
Iron man 3              P3           60
Legend is born: Ip Man  P3           17

σ stock<10 ( π name, provider_id, stock ( Audio_CD ) )
name     provider_id  stock
Miracle  P2           4

σ stock<10 ( π name, provider_id, stock ( DVD ) )
name              provider_id  stock
Prince of Persia  P2           3

σ stock<10 ( π name, provider_id, stock ( Audio_CD ) ) ∪ σ stock<10 ( π name, provider_id, stock ( DVD ) )
name              provider_id  stock
Miracle           P2           4
Prince of Persia  P2           3
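A short sketch can confirm Rule 9 on the Audio_CD and DVD data. The tuples below are the already projected (name, provider_id, stock) rows from the slides; the helper `select` and the predicate `low_stock` are illustrative names, not part of the course material.

```python
# Projected relations from the slides: (name, provider_id, stock).
audio_cd = {("One Heart", "P1", 55), ("Miracle", "P2", 4)}
dvd = {
    ("Prince of Persia", "P2", 3),
    ("Iron man 3", "P3", 60),
    ("Legend is born: Ip Man", "P3", 17),
}


def select(rel, pred):
    """Selection: keep the tuples that satisfy pred."""
    return {t for t in rel if pred(t)}


def low_stock(t):
    """Predicate stock < 10; stock is the third attribute."""
    return t[2] < 10


# Selection applied after the set operation ...
after_union = select(audio_cd | dvd, low_stock)
after_intersection = select(audio_cd & dvd, low_stock)
after_difference = select(audio_cd - dvd, low_stock)

# ... versus selection pushed below the set operation.
pushed_union = select(audio_cd, low_stock) | select(dvd, low_stock)
pushed_intersection = select(audio_cd, low_stock) & select(dvd, low_stock)
pushed_difference = select(audio_cd, low_stock) - select(dvd, low_stock)

print(after_union == pushed_union)
print(after_intersection == pushed_intersection)
print(after_difference == pushed_difference)
print(sorted(pushed_union))
```

With the selection pushed down, the union only has to combine one tuple from each side instead of all five, which is why an optimizer prefers the second form.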
Equivalence rules
Rule 10. The projection operation distributes over the union operation.
πL ( E1 ∪ E2 ) = πL ( E1 ) ∪ πL ( E2 )
Projection does not distribute over intersection and set difference.
πL ( E1 ∩ E2 ) ≠ πL ( E1 ) ∩ πL ( E2 )
πL ( E1 - E2 ) ≠ πL ( E1 ) - πL ( E2 )
Equivalence rules
Why does projection not distribute over intersection and set difference?
To show that projection does not distribute over intersection, you only need to provide a counter example.

R
A  B
1  1

S
A  B
1  2

R ∩ S = empty set, so πA ( R ∩ S ) = empty set.
πA ( R ) = { 1 } and πA ( S ) = { 1 }, so πA ( R ) ∩ πA ( S ) = { 1 }.

This counter example shows that
πA ( R ∩ S ) ≠ πA ( R ) ∩ πA ( S )
Now, can you try to construct a counter example to show that projection does not distribute over set difference?
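The counter example can be reproduced in a few lines. Relations are modelled as sets of (A, B) tuples, and `project_a` is an illustrative helper name for πA; this is a sketch for checking the slide, not a general projection operator.

```python
# R and S from the slide, as sets of (A, B) tuples.
R = {(1, 1)}
S = {(1, 2)}


def project_a(rel):
    """πA: keep only attribute A; the set removes duplicates."""
    return {(a,) for a, _b in rel}


# Rule 10 holds for union on this data.
union_distributes = project_a(R | S) == (project_a(R) | project_a(S))

# Intersection: project after intersecting vs. intersect the projections.
lhs = project_a(R & S)             # R ∩ S is empty, so this is empty
rhs = project_a(R) & project_a(S)  # both projections contain A = 1

print(union_distributes)
print(lhs, rhs, lhs == rhs)
```

The two sides differ because R and S agree on A but not on B: the tuples are distinct before projection and collapse to the same value afterwards.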
Equivalence rules
Note that the equivalence rules listed are just a partial list of equivalences.
Please refer to the textbook for more equivalences involving extended relational operators, such as the outer join and aggregation.
Section 4

Example of
query optimization

Transformation
Find the names of all instructors in the CS department (dpt_id = 1) who have taught a course in the 2nd semester, together with the titles of those courses.

SQL
    SELECT I.name, C.title
    FROM Instructor I, Teaches T, Course C
    WHERE I.dpt_id = 1 AND
          T.sem = 2 AND
          I.instructor_id = T.instructor_id AND
          T.course_id = C.course_id;

Relational algebra
    π I.name, C.title (
        σ I.dpt_id=1 ∧ T.sem=2 (
            ρI ( Instructor ) ⋈ ( ρT ( Teaches ) ⋈ ρC ( Course ) )
        )
    )

Instructor
instructor_id  name     dpt_id
1              Kit      1
2              Ben      1
3              Michael  2
4              William  3

Teaches
instructor_id  course_id  sem
1              1          1
1              2          2
2              4          1
3              3          2

Course
course_id  title          credit
1          Intro to DB    6
2          Programming I  6
3          Accounting     6
4          Algorithms     6
Transformation
Rule 6
πL1 ∪ L2 ( E1 ⋈ E2 ) = πL1 ∪ L2 ( ( πL1 ∪ L3 ( E1 ) ) ⋈ ( πL2 ∪ L3 ( E2 ) ) )

Let's try to push the projection πC.title downward and apply it ahead of the natural joins.
Since this natural join requires C.course_id = T.course_id, we have to add the joining attribute C.course_id, which makes the projection πC.course_id,C.title.

BEFORE
π I.name, C.title ( σ I.dpt_id=1 ∧ T.sem=2 ( ρI ( Instructor ) ⋈ ( ρT ( Teaches ) ⋈ ρC ( Course ) ) ) )

π I.name, C.title
    σ I.dpt_id=1 ∧ T.sem=2
        ⋈
            I
            ⋈
                T
                C

is equivalent to

AFTER
π I.name, C.title ( σ I.dpt_id=1 ∧ T.sem=2 ( ρI ( Instructor ) ⋈ ( ρT ( Teaches ) ⋈ π C.course_id, C.title ( ρC ( Course ) ) ) ) )

π I.name, C.title
    σ I.dpt_id=1 ∧ T.sem=2
        ⋈
            I
            ⋈
                T
                π C.course_id, C.title
                    C
Transformation
Rule 4
( E1 ⋈ E2 ) ⋈ E3 = E1 ⋈ ( E2 ⋈ E3 )

Now we would like to push the selection down to reduce the size of the temporary result of the natural join.
As the selection involves relations I and T only, we would like to rearrange the natural joins so that I and T are under one natural join.
Since natural joins are associative, we can make such a rearrangement.

BEFORE
π I.name, C.title ( σ I.dpt_id=1 ∧ T.sem=2 ( ρI ( Instructor ) ⋈ ( ρT ( Teaches ) ⋈ π C.course_id, C.title ( ρC ( Course ) ) ) ) )

π I.name, C.title
    σ I.dpt_id=1 ∧ T.sem=2
        ⋈
            I
            ⋈
                T
                π C.course_id, C.title
                    C

is equivalent to

AFTER
π I.name, C.title ( σ I.dpt_id=1 ∧ T.sem=2 ( ( ρI ( Instructor ) ⋈ ρT ( Teaches ) ) ⋈ π C.course_id, C.title ( ρC ( Course ) ) ) )

π I.name, C.title
    σ I.dpt_id=1 ∧ T.sem=2
        ⋈
            ⋈
                I
                T
            π C.course_id, C.title
                C
Transformation
Rule 5a
σp1 ( E1 ⋈ E2 ) = ( σp1 ( E1 ) ⋈ E2 )

Now we can push the selection down one level.
According to Rule 5a, we can distribute both selection predicates to the L.H.S. of the natural join, as the R.H.S. does not contain any attribute in the selection predicate.

BEFORE
π I.name, C.title ( σ I.dpt_id=1 ∧ T.sem=2 ( ( ρI ( Instructor ) ⋈ ρT ( Teaches ) ) ⋈ π C.course_id, C.title ( ρC ( Course ) ) ) )

π I.name, C.title
    σ I.dpt_id=1 ∧ T.sem=2
        ⋈
            ⋈
                I
                T
            π C.course_id, C.title
                C

is equivalent to

AFTER
π I.name, C.title ( σ I.dpt_id=1 ∧ T.sem=2 ( ρI ( Instructor ) ⋈ ρT ( Teaches ) ) ⋈ π C.course_id, C.title ( ρC ( Course ) ) )

π I.name, C.title
    ⋈
        σ I.dpt_id=1 ∧ T.sem=2
            ⋈
                I
                T
        π C.course_id, C.title
            C
Transformation
Rule 5b
σp1 ∧ p2 ( E1 ⋈ E2 ) = ( σp1 ( E1 ) ⋈ σp2 ( E2 ) )

Now we can push the selection one more level down by applying Rule 5b.
According to Rule 5b, we can distribute
- σ I.dpt_id=1 to I
- σ T.sem=2 to T

BEFORE
π I.name, C.title ( σ I.dpt_id=1 ∧ T.sem=2 ( ρI ( Instructor ) ⋈ ρT ( Teaches ) ) ⋈ π C.course_id, C.title ( ρC ( Course ) ) )

π I.name, C.title
    ⋈
        σ I.dpt_id=1 ∧ T.sem=2
            ⋈
                I
                T
        π C.course_id, C.title
            C

is equivalent to

AFTER
π I.name, C.title ( ( σ I.dpt_id=1 ( ρI ( Instructor ) ) ⋈ σ T.sem=2 ( ρT ( Teaches ) ) ) ⋈ π C.course_id, C.title ( ρC ( Course ) ) )

π I.name, C.title
    ⋈
        ⋈
            σ I.dpt_id=1
                I
            σ T.sem=2
                T
        π C.course_id, C.title
            C
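The four rewriting steps can be read as one chain of equalities. The block below is a summary sketch in LaTeX, writing I, T and C as shorthand for the renamed relations ρI(Instructor), ρT(Teaches) and ρC(Course); the shorthand and the layout are assumptions made here for compactness, not notation taken from the slides.

```latex
\begin{align*}
  & \pi_{I.name,\,C.title}\bigl(\sigma_{I.dpt\_id=1 \,\land\, T.sem=2}\bigl(I \bowtie (T \bowtie C)\bigr)\bigr) \\
  ={}& \pi_{I.name,\,C.title}\bigl(\sigma_{I.dpt\_id=1 \,\land\, T.sem=2}\bigl(I \bowtie (T \bowtie \pi_{C.course\_id,\,C.title}(C))\bigr)\bigr)
      && \text{(Rule 6)} \\
  ={}& \pi_{I.name,\,C.title}\bigl(\sigma_{I.dpt\_id=1 \,\land\, T.sem=2}\bigl((I \bowtie T) \bowtie \pi_{C.course\_id,\,C.title}(C)\bigr)\bigr)
      && \text{(Rule 4)} \\
  ={}& \pi_{I.name,\,C.title}\bigl(\sigma_{I.dpt\_id=1 \,\land\, T.sem=2}(I \bowtie T) \bowtie \pi_{C.course\_id,\,C.title}(C)\bigr)
      && \text{(Rule 5a)} \\
  ={}& \pi_{I.name,\,C.title}\bigl(\bigl(\sigma_{I.dpt\_id=1}(I) \bowtie \sigma_{T.sem=2}(T)\bigr) \bowtie \pi_{C.course\_id,\,C.title}(C)\bigr)
      && \text{(Rule 5b)}
\end{align*}
```

Each line differs from the previous one by exactly one rule application, so the last expression returns the same tuples as the first on every database instance.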
Illustration (original tree)
π I.name, C.title ( σ I.dpt_id=1 ∧ T.sem=2 ( ρI ( Instructor ) ⋈ ( ρT ( Teaches ) ⋈ ρC ( Course ) ) ) )

Expression tree (root at the top):
π I.name, C.title
    σ I.dpt_id=1 ∧ T.sem=2
        ⋈
            I
            ⋈
                T
                C

Instructor
instructor_id  name     dpt_id
1              Kit      1
2              Ben      1
3              Michael  2
4              William  3

Teaches
instructor_id  course_id  sem
1              1          1
1              2          2
2              4          1
3              3          2

Course
course_id  title          credit
1          Intro to DB    6
2          Programming I  6
3          Accounting     6
4          Algorithms     6

ρT ( Teaches ) ⋈ ρC ( Course )
Natural join evaluates 4*4 = 16 combinations.
instructor_id  course_id  sem  title          credit
1              1          1    Intro to DB    6
1              2          2    Programming I  6
2              4          1    Algorithms     6
3              3          2    Accounting     6

ρI ( Instructor ) ⋈ ( ρT ( Teaches ) ⋈ ρC ( Course ) )
Natural join evaluates 4*4 = 16 combinations.
instructor_id  name     dpt_id  course_id  sem  title          credit
1              Kit      1       1          1    Intro to DB    6
1              Kit      1       2          2    Programming I  6
2              Ben      1       4          1    Algorithms     6
3              Michael  2       3          2    Accounting     6
Illustration (original tree)
π I.name, C.title ( σ I.dpt_id=1 ∧ T.sem=2 ( ρI ( Instructor ) ⋈ ( ρT ( Teaches ) ⋈ ρC ( Course ) ) ) )

Expression tree (root at the top):
π I.name, C.title
    σ I.dpt_id=1 ∧ T.sem=2
        ⋈
            I
            ⋈
                T
                C

ρI ( Instructor ) ⋈ ( ρT ( Teaches ) ⋈ ρC ( Course ) )
instructor_id  name     dpt_id  course_id  sem  title          credit
1              Kit      1       1          1    Intro to DB    6
1              Kit      1       2          2    Programming I  6
2              Ben      1       4          1    Algorithms     6
3              Michael  2       3          2    Accounting     6

σ I.dpt_id=1 ∧ T.sem=2 ( ρI ( Instructor ) ⋈ ( ρT ( Teaches ) ⋈ ρC ( Course ) ) )
instructor_id  name  dpt_id  course_id  sem  title          credit
1              Kit   1       2          2    Programming I  6

π I.name, C.title ( σ I.dpt_id=1 ∧ T.sem=2 ( ρI ( Instructor ) ⋈ ( ρT ( Teaches ) ⋈ ρC ( Course ) ) ) )
name  title
Kit   Programming I
Illustration (transformed tree)
π I.name, C.title ( ( σ I.dpt_id=1 ( ρI ( Instructor ) ) ⋈ σ T.sem=2 ( ρT ( Teaches ) ) ) ⋈ π C.course_id, C.title ( ρC ( Course ) ) )

Expression tree (root at the top):
π I.name, C.title
    ⋈
        ⋈
            σ I.dpt_id=1
                I
            σ T.sem=2
                T
        π C.course_id, C.title
            C

Instructor
instructor_id  name     dpt_id
1              Kit      1
2              Ben      1
3              Michael  2
4              William  3

Teaches
instructor_id  course_id  sem
1              1          1
1              2          2
2              4          1
3              3          2

Course
course_id  title          credit
1          Intro to DB    6
2          Programming I  6
3          Accounting     6
4          Algorithms     6

σ I.dpt_id=1 ( ρI ( Instructor ) )
instructor_id  name  dpt_id
1              Kit   1
2              Ben   1

σ T.sem=2 ( ρT ( Teaches ) )
instructor_id  course_id  sem
1              2          2
3              3          2

π C.course_id, C.title ( ρC ( Course ) )
course_id  title
1          Intro to DB
2          Programming I
3          Accounting
4          Algorithms

σ I.dpt_id=1 ( ρI ( Instructor ) ) ⋈ σ T.sem=2 ( ρT ( Teaches ) )
Natural join evaluates 2*2 = 4 combinations.
instructor_id  name  dpt_id  course_id  sem
1              Kit   1       2          2
Illustration (transformed tree)
π I.name, C.title ( ( σ I.dpt_id=1 ( ρI ( Instructor ) ) ⋈ σ T.sem=2 ( ρT ( Teaches ) ) ) ⋈ π C.course_id, C.title ( ρC ( Course ) ) )

Expression tree (root at the top):
π I.name, C.title
    ⋈
        ⋈
            σ I.dpt_id=1
                I
            σ T.sem=2
                T
        π C.course_id, C.title
            C

σ I.dpt_id=1 ( ρI ( Instructor ) ) ⋈ σ T.sem=2 ( ρT ( Teaches ) )
instructor_id  name  dpt_id  course_id  sem
1              Kit   1       2          2

π C.course_id, C.title ( ρC ( Course ) )
course_id  title
1          Intro to DB
2          Programming I
3          Accounting
4          Algorithms

( σ I.dpt_id=1 ( ρI ( Instructor ) ) ⋈ σ T.sem=2 ( ρT ( Teaches ) ) ) ⋈ π C.course_id, C.title ( ρC ( Course ) )
Natural join evaluates 1*4 = 4 combinations.
instructor_id  name  dpt_id  course_id  sem  title
1              Kit   1       2          2    Programming I

π I.name, C.title ( ( σ I.dpt_id=1 ( ρI ( Instructor ) ) ⋈ σ T.sem=2 ( ρT ( Teaches ) ) ) ⋈ π C.course_id, C.title ( ρC ( Course ) ) )
name  title
Kit   Programming I
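The two illustrations can be replayed in code to compare the work done by each tree. The sketch below is a toy evaluator under stated assumptions: relations are lists of dicts, the natural join is a nested loop that counts every pair it compares, and the helper names (`select`, `project`, `natural_join`, `original_plan`, `transformed_plan`) are invented for this example. A real DBMS would use indexes and better join algorithms, so the counts only mirror the "combinations" figures on the slides.

```python
instructor = [
    {"instructor_id": 1, "name": "Kit", "dpt_id": 1},
    {"instructor_id": 2, "name": "Ben", "dpt_id": 1},
    {"instructor_id": 3, "name": "Michael", "dpt_id": 2},
    {"instructor_id": 4, "name": "William", "dpt_id": 3},
]
teaches = [
    {"instructor_id": 1, "course_id": 1, "sem": 1},
    {"instructor_id": 1, "course_id": 2, "sem": 2},
    {"instructor_id": 2, "course_id": 4, "sem": 1},
    {"instructor_id": 3, "course_id": 3, "sem": 2},
]
course = [
    {"course_id": 1, "title": "Intro to DB", "credit": 6},
    {"course_id": 2, "title": "Programming I", "credit": 6},
    {"course_id": 3, "title": "Accounting", "credit": 6},
    {"course_id": 4, "title": "Algorithms", "credit": 6},
]


def select(rel, pred):
    """Selection: keep the rows that satisfy pred."""
    return [row for row in rel if pred(row)]


def project(rel, attrs):
    """Projection: keep only attrs and drop duplicate tuples."""
    out = []
    for row in rel:
        t = {a: row[a] for a in attrs}
        if t not in out:
            out.append(t)
    return out


def natural_join(r, s):
    """Nested-loop natural join; returns (rows, pairs_compared)."""
    common = set(r[0]) & set(s[0]) if r and s else set()
    rows, pairs = [], 0
    for a in r:
        for b in s:
            pairs += 1
            if all(a[k] == b[k] for k in common):
                rows.append({**a, **b})
    return rows, pairs


def original_plan():
    """Join everything first, then select, then project."""
    tc, n1 = natural_join(teaches, course)
    itc, n2 = natural_join(instructor, tc)
    chosen = select(itc, lambda r: r["dpt_id"] == 1 and r["sem"] == 2)
    return project(chosen, ["name", "title"]), n1 + n2


def transformed_plan():
    """Selections and projection pushed below the joins."""
    i1 = select(instructor, lambda r: r["dpt_id"] == 1)
    t2 = select(teaches, lambda r: r["sem"] == 2)
    it, n1 = natural_join(i1, t2)
    c = project(course, ["course_id", "title"])
    itc, n2 = natural_join(it, c)
    return project(itc, ["name", "title"]), n1 + n2


result_original, pairs_original = original_plan()
result_transformed, pairs_transformed = transformed_plan()
print(result_original, pairs_original)
print(result_transformed, pairs_transformed)
```

Both plans return the single tuple (Kit, Programming I). The original tree compares 16 + 16 = 32 pairs, while the transformed tree compares 4 + 4 = 8, matching the figures in the illustrations.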
Summary
Relational algebra (RA) defines a set of algebraic operations that take tables as input and output a table as the result.
6 fundamental operations (in Chapter 4A).
Additional operations do not extend the power of the fundamental operators, but they simplify expressions.
Extended operations add expressive power.
Relational algebra (RA) is the basis of query optimization.
More optimization techniques are discussed in Chapter 13 of the textbook.
Chapter 4B.

END
COMP3278 Introduction to
Database Management Systems
