Slides RelationalAlgebra

The document discusses relational algebra, which is a formalism for querying and manipulating relations through a set of operators. It describes the five basic relational algebra operators: selection, projection, union, difference, and cartesian product. It also covers derived operators like joins, renaming, and semijoins. Examples are provided to demonstrate how each operator works and how they can be combined to form more complex expressions. Finally, it notes some limitations of relational algebra in expressing certain queries.

Lecture 07: Relational Algebra

Outline
Relational Algebra (Section 6.1)

Relational Algebra
Formalism for creating new relations from existing ones
Its place in the big picture:
Declarative query language: SQL, relational calculus
Algebra: relational algebra
Implementation

Relational Algebra
Five operators:
Union: ∪
Difference: −
Selection: σ
Projection: Π
Cartesian product: ×

Derived or auxiliary operators:
Intersection, complement
Joins (natural, equi-join, theta join, semi-join)
Renaming: ρ

1. Union and 2. Difference


Union: R1 ∪ R2
Example: ActiveEmployees ∪ RetiredEmployees

Difference: R1 − R2
Example: AllEmployees − RetiredEmployees

What about Intersection?

It is a derived operator: R1 ∩ R2 = R1 − (R1 − R2)
Also expressed as a join (will see later)
Example: UnionizedEmployees ∩ RetiredEmployees
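
The identity for intersection can be checked with a small sketch. It assumes relations are modeled as Python sets of tuples (set semantics); the employee names are made-up illustration data, not part of the lecture.

```python
# Relations as sets of tuples; the names are hypothetical sample data.
unionized = {("Alice",), ("Bob",), ("Carol",)}
retired = {("Bob",), ("Carol",), ("Dave",)}

union = unionized | retired        # R1 ∪ R2
difference = unionized - retired   # R1 − R2

# Intersection built from difference alone: R1 ∩ R2 = R1 − (R1 − R2)
derived_intersection = unionized - (unionized - retired)

print(sorted(derived_intersection))
```

Because only set difference is used, intersection adds no expressive power beyond the five basic operators.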

3. Selection
Returns all tuples which satisfy a condition
Notation: σ_c(R)
Examples:
σ_{Salary > 40000}(Employee)
σ_{name = "Smith"}(Employee)

The condition c can use the comparisons =, <, ≤, >, ≥, <>

[In SQL: SELECT * FROM Employee WHERE Salary > 40000]

Selection Example

Employee
SSN        Name   DepartmentID  Salary
999999999  John   1             30,000
777777777  Tony   1             32,000
888888888  Alice  2             45,000

Find all employees with salary more than $40,000.
σ_{Salary > 40000}(Employee)

SSN        Name   DepartmentID  Salary
888888888  Alice  2             45,000
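
The selection example can be reproduced with a minimal sketch. It assumes each tuple is a Python dict keyed by attribute name; the helper name `select` is chosen here for illustration.

```python
# Employee relation from the example, one dict per tuple.
employee = [
    {"SSN": "999999999", "Name": "John", "DepartmentID": 1, "Salary": 30000},
    {"SSN": "777777777", "Name": "Tony", "DepartmentID": 1, "Salary": 32000},
    {"SSN": "888888888", "Name": "Alice", "DepartmentID": 2, "Salary": 45000},
]

def select(relation, condition):
    """σ_condition(relation): keep the tuples for which condition is true."""
    return [row for row in relation if condition(row)]

high_paid = select(employee, lambda row: row["Salary"] > 40000)
print(high_paid)
```

The schema is unchanged by selection; only the set of tuples shrinks.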

4. Projection
Eliminates columns, then removes duplicates
Notation: Π_{A1,…,An}(R)
Example: project to social-security number and names:
Π_{SSN, Name}(Employee)
Output schema: Answer(SSN, Name)
[In SQL: SELECT DISTINCT SSN, Name FROM Employee]

Projection Example

Employee
SSN        Name   DepartmentID  Salary
999999999  John   1             30,000
777777777  Tony   1             32,000
888888888  Alice  2             45,000

Π_{SSN, Name}(Employee)

SSN        Name
999999999  John
777777777  Tony
888888888  Alice
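
A possible sketch of projection under set semantics follows. The helper name `project` is an assumption made for illustration; projecting onto DepartmentID is added to show the duplicate-removal step, which the SSN, Name example does not exercise.

```python
employee = [
    {"SSN": "999999999", "Name": "John", "DepartmentID": 1, "Salary": 30000},
    {"SSN": "777777777", "Name": "Tony", "DepartmentID": 1, "Salary": 32000},
    {"SSN": "888888888", "Name": "Alice", "DepartmentID": 2, "Salary": 45000},
]

def project(relation, attributes):
    """Π_attributes(relation): drop the other columns, then drop duplicates."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:          # duplicate elimination (set semantics)
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

names = project(employee, ["SSN", "Name"])
departments = project(employee, ["DepartmentID"])
print(names)
print(departments)
```

John and Tony share department 1, so the projection onto DepartmentID has two tuples rather than three.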

5. Cartesian Product
Combine each tuple in R1 with each tuple in R2
Notation: R1 × R2
Example: Employee × Dependents

Very rare in practice; mainly used to express joins

[In SQL: SELECT * FROM R1, R2]

Cartesian Product Example

Employee
Name  SSN
John  999999999
Tony  777777777

Dependents
EmployeeSSN  Dname
999999999    Emily
777777777    Joe

Employee × Dependents
Name  SSN        EmployeeSSN  Dname
John  999999999  999999999    Emily
John  999999999  777777777    Joe
Tony  777777777  999999999    Emily
Tony  777777777  777777777    Joe
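
The product table can be generated with a short sketch. It assumes tuples are dicts and that the two schemas share no attribute names, as in this example; `cartesian_product` is a name chosen for illustration.

```python
from itertools import product

employee = [
    {"Name": "John", "SSN": "999999999"},
    {"Name": "Tony", "SSN": "777777777"},
]
dependents = [
    {"EmployeeSSN": "999999999", "Dname": "Emily"},
    {"EmployeeSSN": "777777777", "Dname": "Joe"},
]

def cartesian_product(r1, r2):
    """R1 × R2: pair every tuple of r1 with every tuple of r2.
    Assumes the two schemas have disjoint attribute names."""
    return [{**t1, **t2} for t1, t2 in product(r1, r2)]

pairs = cartesian_product(employee, dependents)
for row in pairs:
    print(row)
```

The result has |R1| · |R2| tuples, which is why the product is rarely used on its own.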

Renaming
Changes the schema, not the instance
Schema: R(A1, …, An)
Notation: ρ_{B1,…,Bn}(R)
Example:
ρ_{LastName, SocSocNo}(Employee)
Output schema: Answer(LastName, SocSocNo)
[In SQL: SELECT Name AS LastName, SSN AS SocSocNo FROM Employee]

Renaming Example
Employee
Name  SSN
John  999999999
Tony  777777777

ρ_{LastName, SocSocNo}(Employee)

LastName  SocSocNo
John      999999999
Tony      777777777
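
Renaming can be sketched as a change of dictionary keys. The helper `rename` and its positional old-name/new-name arguments are assumptions made for this illustration.

```python
employee = [
    {"Name": "John", "SSN": "999999999"},
    {"Name": "Tony", "SSN": "777777777"},
]

def rename(relation, old_names, new_names):
    """ρ_new_names(relation): rename attributes position by position.
    The values (the instance) are left untouched."""
    mapping = dict(zip(old_names, new_names))
    return [{mapping[a]: v for a, v in row.items()} for row in relation]

renamed = rename(employee, ["Name", "SSN"], ["LastName", "SocSocNo"])
print(renamed)
```

Only the attribute names differ between `employee` and `renamed`; the rows hold the same values in the same order.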

Natural Join
Notation: R1 ⋈ R2
Meaning: R1 ⋈ R2 = Π_A(σ_C(R1 × R2))

Where:
The selection σ_C checks equality of all common attributes
The projection Π_A eliminates the duplicate common attributes
[In SQL, for schemas R1(A, B), R2(B, C): SELECT DISTINCT R1.A, R1.B, R2.C FROM R1, R2 WHERE R1.B = R2.B]

Natural Join Example

Employee
Name  SSN
John  999999999
Tony  777777777

Dependents
SSN        Dname
999999999  Emily
777777777  Joe

Employee ⋈ Dependents = Π_{Name, SSN, Dname}(σ_{SSN=SSN2}(Employee × ρ_{SSN2, Dname}(Dependents)))

Name  SSN        Dname
John  999999999  Emily
Tony  777777777  Joe

Natural Join
R =
A  B
X  Y
X  Z
Y  Z
Z  V

S =
B  C
Z  U
V  W
Z  V

R ⋈ S =
A  B  C
X  Z  U
X  Z  V
Y  Z  U
Y  Z  V
Z  V  W
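
The R ⋈ S table can be recomputed with a minimal nested-loop sketch. It assumes tuples are dicts and reads each schema from the first tuple, so it simply returns an empty result when either input is empty; `natural_join` is a name chosen for illustration.

```python
def natural_join(r, s):
    """R ⋈ S: combine tuples that agree on every attribute the schemas share."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]   # shared attribute names
    result = []
    for t1 in r:
        for t2 in s:
            if all(t1[a] == t2[a] for a in common):
                merged = {**t1, **t2}          # common attributes appear once
                if merged not in result:       # set semantics
                    result.append(merged)
    return result

R = [{"A": "X", "B": "Y"}, {"A": "X", "B": "Z"},
     {"A": "Y", "B": "Z"}, {"A": "Z", "B": "V"}]
S = [{"B": "Z", "C": "U"}, {"B": "V", "C": "W"}, {"B": "Z", "C": "V"}]

joined = natural_join(R, S)
for row in joined:
    print(row["A"], row["B"], row["C"])
```

The tuple (X, Y) of R has no partner in S, so it does not appear in the result.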

Natural Join
Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of R ⋈ S?

Given R(A, B, C), S(D, E), what is R ⋈ S?

Given R(A, B), S(A, B), what is R ⋈ S?

Theta Join
A join that involves a predicate
R1 ⋈_θ R2 = σ_θ(R1 × R2)
Here θ can be any condition

Equi-join
A theta join where θ is an equality
R1 ⋈_{A=B} R2 = σ_{A=B}(R1 × R2)
Example: Employee ⋈_{SSN=SSN} Dependents

Most useful join in practice (difference to natural join?)
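
A theta join can be sketched directly from its definition as a selection over a product, with the equi-join as the special case where the predicate is an equality. The attribute name EmployeeSSN is an assumption used here to keep the two schemas disjoint; `theta_join` is an illustrative helper name.

```python
from itertools import product

def theta_join(r1, r2, theta):
    """R1 ⋈_θ R2 = σ_θ(R1 × R2); assumes disjoint attribute names."""
    return [{**t1, **t2} for t1, t2 in product(r1, r2) if theta(t1, t2)]

employee = [
    {"Name": "John", "SSN": "999999999"},
    {"Name": "Tony", "SSN": "777777777"},
]
dependents = [
    {"EmployeeSSN": "999999999", "Dname": "Emily"},
    {"EmployeeSSN": "777777777", "Dname": "Joe"},
]

# Equi-join: θ is an equality between one attribute from each side.
equi = theta_join(employee, dependents,
                  lambda e, d: e["SSN"] == d["EmployeeSSN"])

# θ can be any condition, for example an inequality.
mismatched = theta_join(employee, dependents,
                        lambda e, d: e["SSN"] != d["EmployeeSSN"])
print(equi)
print(mismatched)
```

In this sketch both join columns (SSN and EmployeeSSN) remain in the equi-join output.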

Semijoin
R ⋉ S = Π_{A1,…,An}(R ⋈ S)
Where A1, …, An are the attributes in R
Example: Employee ⋉ Dependents
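
A semijoin keeps the tuples of R that have at least one join partner in S. The sketch below assumes dict tuples with schemas read from the first tuple; the employee Alice and the dependent Max are made-up data added to show a non-matching tuple and a repeated match.

```python
def semijoin(r, s):
    """R ⋉ S: tuples of R that join with at least one tuple of S."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    keys = {tuple(t[a] for a in common) for t in s}   # join keys present in S
    return [t for t in r if tuple(t[a] for a in common) in keys]

employee = [
    {"Name": "John", "SSN": "999999999"},
    {"Name": "Tony", "SSN": "777777777"},
    {"Name": "Alice", "SSN": "888888888"},   # hypothetical: no dependents
]
dependents = [
    {"SSN": "999999999", "Dname": "Emily"},
    {"SSN": "999999999", "Dname": "Max"},    # hypothetical second dependent
    {"SSN": "777777777", "Dname": "Joe"},
]

with_dependents = semijoin(employee, dependents)
print(with_dependents)
```

John appears once even though he matches two dependents, because the result contains only attributes of R.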

Semijoins in Distributed Databases

Semijoins are used in distributed databases

Employee(SSN, Name, ...) and Dependents(SSN, Dname, Age, ...) are stored at different sites, connected by a network.

Query: Employee ⋈_{ssn=ssn} (σ_{age>71}(Dependents))

T = Π_SSN(σ_{age>71}(Dependents))
R = Employee ⋉ T
Answer = R ⋈ σ_{age>71}(Dependents)
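
The three steps can be simulated in one process as a rough sketch. The two lists stand in for the two sites, the ages and the extra rows are made-up data, and the final join is taken against the age-filtered dependents so that the answer matches the original query.

```python
# Site 1 holds Employee, site 2 holds Dependents (sample data is hypothetical).
employee = [
    {"SSN": "999999999", "Name": "John"},
    {"SSN": "777777777", "Name": "Tony"},
    {"SSN": "888888888", "Name": "Alice"},
]
dependents = [
    {"SSN": "999999999", "Dname": "Emily", "Age": 75},
    {"SSN": "999999999", "Dname": "Max", "Age": 10},
    {"SSN": "777777777", "Dname": "Joe", "Age": 30},
]

# Step 1 (Dependents site): T = Π_SSN(σ_age>71(Dependents)); only SSNs are shipped.
old_dependents = [d for d in dependents if d["Age"] > 71]
T = {d["SSN"] for d in old_dependents}

# Step 2 (Employee site): R = Employee ⋉ T; only matching employees are shipped back.
R = [e for e in employee if e["SSN"] in T]

# Step 3 (Dependents site): join R with the selected dependents.
answer = [{**e, **d} for e in R for d in old_dependents if e["SSN"] == d["SSN"]]

# Reference result: the join computed directly, without the semijoin reduction.
direct = [{**e, **d} for e in employee for d in old_dependents
          if e["SSN"] == d["SSN"]]
print(answer)
```

Only one SSN and one employee tuple cross the simulated network here, instead of the whole Employee relation.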

Complex RA Expressions
Expression tree, root first:

Π_name
  ⋈_{buyer-ssn=ssn}
    ⋈_{pid=pid}
      ⋈_{seller-ssn=ssn}
        Π_ssn(σ_{name=fred}(Person))
        Purchase
      Π_pid(σ_{name=gizmo}(Product))
    Person

Application: Query Rewriting for Optimization
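Plan 1 (selections after the join):
Π_sname(σ_{bid=100 ∧ rating>5}(Reserves ⋈_{sid=sid} Sailors))

Plan 2 (selections pushed below the join):
Π_sname(σ_{bid=100}(Reserves) ⋈_{sid=sid} σ_{rating>5}(Sailors))
σ_{bid=100}(Reserves): scan; write to temp T1
σ_{rating>5}(Sailors): scan; write to temp T2

The earlier we process selections, the fewer tuples we need to manipulate higher up in the tree (predicate pushdown).
Disadvantages?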
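
The effect of pushing selections down can be seen on a toy instance. The Sailors and Reserves rows below are made-up data, and `join_on_sid` is an illustrative nested-loop join, not how a real engine would execute the plan.

```python
sailors = [
    {"sid": 1, "sname": "Dustin", "rating": 7},
    {"sid": 2, "sname": "Lubber", "rating": 3},
    {"sid": 3, "sname": "Rusty", "rating": 10},
]
reserves = [
    {"sid": 1, "bid": 100},
    {"sid": 2, "bid": 100},
    {"sid": 3, "bid": 101},
    {"sid": 1, "bid": 102},
]

def join_on_sid(left, right):
    """Nested-loop equi-join on sid."""
    return [{**l, **r} for l in left for r in right if l["sid"] == r["sid"]]

# Plan 1: join first, then apply both selections.
joined = join_on_sid(reserves, sailors)
plan1 = {t["sname"] for t in joined if t["bid"] == 100 and t["rating"] > 5}

# Plan 2: apply each selection to its own relation, then join.
t1 = [r for r in reserves if r["bid"] == 100]
t2 = [s for s in sailors if s["rating"] > 5]
pushed = join_on_sid(t1, t2)
plan2 = {t["sname"] for t in pushed}

print(plan1, len(joined))   # intermediate join size without pushdown
print(plan2, len(pushed))   # intermediate join size with pushdown
```

Both plans return the same names, but the join in the second plan produces fewer intermediate tuples.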

Algebraic Laws (Examples)


Commutative and Associative Laws
R ∪ S = S ∪ R, R ∪ (S ∪ T) = (R ∪ S) ∪ T
R ⋈ S = S ⋈ R, R ⋈ (S ⋈ T) = (R ⋈ S) ⋈ T

Laws involving selection
σ_{C AND C'}(R) = σ_C(σ_{C'}(R)) = σ_C(R) ∩ σ_{C'}(R)
σ_C(R ⋈ S) = σ_C(R) ⋈ S, when C involves only attributes of R

Laws involving projections
Π_M(Π_N(R)) = Π_M(R), when M ⊆ N

Operations on Bags
A bag = a set with repeated elements
All operations need to be defined carefully on bags
{a,b,b,c} ∪ {a,b,b,b,e,f,f} = {a,a,b,b,b,b,b,c,e,f,f}
{a,b,b,b,c,c} − {b,c,c,c,d} = {a,b,b}
σ_C(R): preserve the number of occurrences
Π_A(R): no duplicate elimination
Cartesian product, join: no duplicate elimination
Important! Relational engines work on bags, not sets!
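
The two bag examples can be checked with Python's `collections.Counter`, used here as a stand-in for a bag: `+` adds occurrence counts and `-` subtracts them, dropping elements whose count would fall to zero or below.

```python
from collections import Counter

# Bag union adds the number of occurrences of each element.
bag_union = Counter("abbc") + Counter("abbbeff")
print(sorted(bag_union.elements()))

# Bag difference subtracts occurrences and never goes below zero.
bag_difference = Counter("abbbcc") - Counter("bcccd")
print(sorted(bag_difference.elements()))
```

Element b occurs 2 + 3 = 5 times in the union and 3 − 1 = 2 times in the difference.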

Finally: RA has Limitations!

Cannot compute transitive closure

Name1  Name2  Relationship
Fred   Mary   Father
Mary   Joe    Cousin
Mary   Bill   Spouse
Nancy  Lou    Sister

Find all direct and indirect relatives of Fred
Cannot express in RA! Need to write a C program
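
The kind of program the slide has in mind can be sketched as a loop that repeats a join step until no new names appear. The sketch is in Python rather than C, and it assumes the relationship is followed only from Name1 to Name2; `relatives_of` is an illustrative helper name.

```python
# (Name1, Name2) pairs from the table; the Relationship column is not needed here.
related = [("Fred", "Mary"), ("Mary", "Joe"), ("Mary", "Bill"), ("Nancy", "Lou")]

def relatives_of(person, pairs):
    """All direct and indirect relatives reachable from person via Name1 -> Name2."""
    reached = set()
    frontier = {person}
    while frontier:
        # One join step: follow every pair that starts in the current frontier.
        step = {b for a, b in pairs if a in frontier}
        frontier = step - reached - {person}   # keep only newly found names
        reached |= frontier
    return reached

print(sorted(relatives_of("Fred", related)))
```

Each loop iteration corresponds to one more join, and the number of iterations depends on the data, which a single fixed relational algebra expression cannot capture.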
