Slides RelationalAlgebra
Outline
Relational Algebra (Section 6.1)
Relational Algebra
Formalism for creating new relations from existing ones
Its place in the big picture:
Declarative query language (SQL, relational calculus) → Relational algebra → Implementation
Relational Algebra
Five operators:
Union: ∪
Difference: −
Selection: σ
Projection: Π
Cartesian Product: ×
R1 − R2
Example: AllEmployees − RetiredEmployees
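Union and difference act on relations with the same schema exactly like the ordinary set operations. The sketch below assumes relations are Python sets of (Name, SSN) tuples; the instances and the name `active_employees` are hypothetical and only serve the illustration.

```python
# Hypothetical instances: each relation is a set of (Name, SSN) tuples.
active_employees = {("John", "999999999")}
retired_employees = {("Tony", "777777777")}
all_employees = {("John", "999999999"), ("Tony", "777777777")}

# Union: tuples that appear in either relation.
union_result = active_employees | retired_employees

# Difference: tuples of the first relation that are not in the second.
difference_result = all_employees - retired_employees

print(union_result == all_employees)   # True
print(difference_result)               # {('John', '999999999')}
```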
3. Selection
Returns all tuples which satisfy a condition
Notation: σ_c(R)
Examples:
σ_{Salary > 40000}(Employee)
σ_{name = "Smith"}(Employee)
Find all employees with salary more than $40,000: σ_{Salary > 40000}(Employee)
[Example tables not recovered. Surviving values: input column DepartmentID = 1, 1, 2; result row with DepartmentID 2, Salary 45,000.]
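Selection can be sketched as a filter over the tuples of a relation. The Employee rows below are hypothetical: only the DepartmentID values 1, 1, 2 and the 45,000 salary come from the example, while the names and the other two salaries are made up for the illustration.

```python
def select(relation, condition):
    """sigma_c(R): keep exactly the tuples for which condition(t) is true."""
    return [t for t in relation if condition(t)]

# Hypothetical Employee instance (names and the first two salaries are assumed).
employee = [
    {"Name": "A", "DepartmentID": 1, "Salary": 30000},
    {"Name": "B", "DepartmentID": 1, "Salary": 32000},
    {"Name": "C", "DepartmentID": 2, "Salary": 45000},
]

# sigma_{Salary > 40000}(Employee)
high_paid = select(employee, lambda t: t["Salary"] > 40000)
print(high_paid)
```

The output schema is the same as the input schema; only the number of tuples changes.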
4. Projection
Eliminates columns, then removes duplicates
Notation: Π_{A1,…,An}(R)
Example: project to social-security number and names:
Π_{SSN, Name}(Employee)
Output schema: Answer(SSN, Name)
[In SQL: SELECT DISTINCT SSN, Name FROM Employee]
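A minimal sketch of set-based projection, assuming tuples are Python dicts. The Employee rows are hypothetical; projecting on DepartmentID is used here because it makes the duplicate removal visible.

```python
def project(relation, attributes):
    """Pi_{A1..An}(R): keep the listed columns, then drop duplicate rows."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:          # duplicate elimination (set semantics)
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result

# Hypothetical Employee instance.
employee = [
    {"SSN": "999999999", "Name": "John", "DepartmentID": 1},
    {"SSN": "777777777", "Name": "Tony", "DepartmentID": 1},
    {"SSN": "111111111", "Name": "Ann", "DepartmentID": 2},
]

print(project(employee, ["SSN", "Name"]))    # three rows, SSN is unique
print(project(employee, ["DepartmentID"]))   # two rows: 1 and 2
```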
5. Cartesian Product
Combine each tuple in R1 with each tuple in R2
Notation: R1 × R2
Example: Employee × Dependents
Cartesian Product Example

Employee
Name  SSN
John  999999999
Tony  777777777

Dependents
EmployeeSSN
999999999
777777777

Employee × Dependents
Name  SSN        EmployeeSSN
John  999999999  999999999
John  999999999  777777777
Tony  777777777  999999999
Tony  777777777  777777777
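The product table above can be reproduced with a short sketch. It assumes relations are lists of tuples with the column order (Name, SSN) and (EmployeeSSN), and uses `itertools.product` from the Python standard library to pair the tuples.

```python
from itertools import product

employee = [("John", "999999999"), ("Tony", "777777777")]   # (Name, SSN)
dependents = [("999999999",), ("777777777",)]                # (EmployeeSSN,)

def cartesian_product(r1, r2):
    """R1 x R2: concatenate every tuple of r1 with every tuple of r2."""
    return [t1 + t2 for t1, t2 in product(r1, r2)]

pairs = cartesian_product(employee, dependents)
for row in pairs:
    print(row)
```

The result always has len(r1) * len(r2) tuples, here 2 * 2 = 4.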
Renaming
Changes the schema, not the instance
Schema: R(A1, …, An)
Notation: ρ_{B1,…,Bn}(R)
Example: ρ_{LastName, SocSocNo}(Employee)
Output schema: Answer(LastName, SocSocNo)
[In SQL: SELECT Name AS LastName, SSN AS SocSocNo FROM Employee]
Renaming Example
Employee
Name  SSN
John  999999999
Tony  777777777
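A small sketch of renaming, assuming a relation is stored as a pair (schema, rows). It shows that only the attribute names change while the rows are untouched.

```python
def rename(relation, new_names):
    """rho_{B1..Bn}(R): replace the attribute names, keep the rows as they are."""
    schema, rows = relation
    if len(new_names) != len(schema):
        raise ValueError("renaming must supply one new name per attribute")
    return (tuple(new_names), rows)

employee = (("Name", "SSN"),
            [("John", "999999999"), ("Tony", "777777777")])

answer = rename(employee, ["LastName", "SocSocNo"])
print(answer[0])   # ('LastName', 'SocSocNo')
print(answer[1])   # same rows as Employee
```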
Natural Join
Notation: R1 ⋈ R2
Meaning: R1 ⋈ R2 = Π_A(σ_C(R1 × R2))
Where:
The selection σ_C checks equality of all common attributes
The projection Π_A eliminates the duplicate common attributes
[In SQL, for schema R1(A,B), R2(B,C): SELECT DISTINCT R1.A, R1.B, R2.C FROM R1, R2 WHERE R1.B = R2.B]
Natural Join Example

Employee
Name  SSN
John  999999999
Tony  777777777

Dependents
SSN        Dname
999999999  Emily
777777777  Joe

Employee ⋈ Dependents = Π_{Name, SSN, Dname}(σ_{SSN=SSN2}(Employee × ρ_{SSN2, Dname}(Dependents)))

Name  SSN        Dname
John  999999999  Emily
Tony  777777777  Joe
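The derivation of the natural join from the basic operators can be traced step by step. The sketch below assumes tuples are Python dicts; the helper names `rename`, `product`, `select` and `project` are illustrative, not taken from any library.

```python
def rename(rel, mapping):
    """rho: rename attributes according to mapping {old: new}."""
    return [{mapping.get(k, k): v for k, v in t.items()} for t in rel]

def product(r1, r2):
    """Cartesian product; assumes the attribute names are disjoint."""
    return [{**t1, **t2} for t1 in r1 for t2 in r2]

def select(rel, cond):
    """sigma: keep the tuples that satisfy cond."""
    return [t for t in rel if cond(t)]

def project(rel, attrs):
    """Pi: keep the listed attributes and remove duplicates."""
    out = []
    for t in rel:
        row = {a: t[a] for a in attrs}
        if row not in out:
            out.append(row)
    return out

employee = [{"Name": "John", "SSN": "999999999"},
            {"Name": "Tony", "SSN": "777777777"}]
dependents = [{"SSN": "999999999", "Dname": "Emily"},
              {"SSN": "777777777", "Dname": "Joe"}]

# Pi_{Name,SSN,Dname}(sigma_{SSN=SSN2}(Employee x rho_{SSN2,Dname}(Dependents)))
renamed = rename(dependents, {"SSN": "SSN2"})
joined = project(
    select(product(employee, renamed), lambda t: t["SSN"] == t["SSN2"]),
    ["Name", "SSN", "Dname"],
)
print(joined)
```

The renaming step is what keeps the two SSN columns apart inside the product, so that the selection can compare them.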
Natural Join
R =
A  B
X  Y
X  Z
Y  Z
Z  V

S =
B  C
Z  U
V  W
Z  V

R ⋈ S =
A  B  C
X  Z  U
X  Z  V
Y  Z  U
Y  Z  V
Z  V  W
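The result of R ⋈ S can be checked with a generic nested-loop sketch. It assumes each relation is given as a schema (list of attribute names) plus a list of tuples; the function name `natural_join` is illustrative.

```python
def natural_join(r_schema, r_rows, s_schema, s_rows):
    """R join S: match on all common attributes, keep each of them once."""
    common = [a for a in r_schema if a in s_schema]
    s_extra = [a for a in s_schema if a not in common]
    out_schema = list(r_schema) + s_extra
    out_rows = []
    for r in r_rows:
        r_dict = dict(zip(r_schema, r))
        for s in s_rows:
            s_dict = dict(zip(s_schema, s))
            if all(r_dict[a] == s_dict[a] for a in common):
                row = tuple(r) + tuple(s_dict[a] for a in s_extra)
                if row not in out_rows:      # set semantics
                    out_rows.append(row)
    return out_schema, out_rows

R = [("X", "Y"), ("X", "Z"), ("Y", "Z"), ("Z", "V")]   # schema (A, B)
S = [("Z", "U"), ("V", "W"), ("Z", "V")]               # schema (B, C)

schema, rows = natural_join(["A", "B"], R, ["B", "C"], S)
print(schema)
for row in rows:
    print(row)
```

The tuple (X, Y) of R has no partner in S with B = Y, so it does not appear in the result.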
Natural Join
Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of R ⋈ S?
Given R(A, B, C), S(D, E), what is R ⋈ S?
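One way to check answers to these questions is to compute the output schema mechanically. The sketch assumes schemas are lists of attribute names; both function names are illustrative.

```python
def common_attributes(r_schema, s_schema):
    """Attributes on which the natural join tests equality."""
    return [a for a in r_schema if a in s_schema]

def natural_join_schema(r_schema, s_schema):
    """All attributes of R, followed by the attributes of S not already in R."""
    return list(r_schema) + [a for a in s_schema if a not in r_schema]

print(common_attributes(["A", "B", "C", "D"], ["A", "C", "E"]))    # ['A', 'C']
print(natural_join_schema(["A", "B", "C", "D"], ["A", "C", "E"]))  # ['A', 'B', 'C', 'D', 'E']
print(common_attributes(["A", "B", "C"], ["D", "E"]))              # []
```

When there are no common attributes the equality condition is empty, so every pair of tuples qualifies and the natural join coincides with the Cartesian product.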
Theta Join
A join that involves a predicate
R1 ⋈_θ R2 = σ_θ(R1 × R2)
Here θ can be any condition
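A theta join can be sketched directly from its definition as a selection over the Cartesian product. The relations R1(A) and R2(B) and the condition A < B are hypothetical, chosen to show that θ need not be an equality.

```python
def theta_join(r1, r2, theta):
    """sigma_theta(R1 x R2); assumes the attribute names of r1 and r2 are disjoint."""
    return [{**t1, **t2} for t1 in r1 for t2 in r2 if theta(t1, t2)]

# Hypothetical relations R1(A) and R2(B).
r1 = [{"A": 1}, {"A": 5}]
r2 = [{"B": 3}, {"B": 4}]

less_than = theta_join(r1, r2, lambda t1, t2: t1["A"] < t2["B"])
print(less_than)   # [{'A': 1, 'B': 3}, {'A': 1, 'B': 4}]
```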
Eq-join
A theta join where θ is an equality
R1 ⋈_{A=B} R2 = σ_{A=B}(R1 × R2)
Example: Employee ⋈_{SSN=SSN} Dependents
Semijoin
R ⋉ S = Π_{A1,…,An}(R ⋈ S)
Where A1, …, An are the attributes in R
Example: Employee ⋉ Dependents
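Under set semantics, the semijoin keeps exactly those tuples of R that have at least one join partner in S. The sketch below assumes both relations share the attribute SSN; the third employee is hypothetical and added only to show a tuple being filtered out.

```python
def semijoin(r, s, common):
    """R semijoin S: tuples of r that match some tuple of s on the common attributes."""
    s_keys = {tuple(t[a] for a in common) for t in s}
    return [t for t in r if tuple(t[a] for a in common) in s_keys]

employee = [
    {"Name": "John", "SSN": "999999999"},
    {"Name": "Tony", "SSN": "777777777"},
    {"Name": "Ann", "SSN": "111111111"},   # hypothetical: has no dependents
]
dependents = [
    {"SSN": "999999999", "Dname": "Emily"},
    {"SSN": "777777777", "Dname": "Joe"},
]

with_dependents = semijoin(employee, dependents, ["SSN"])
print(with_dependents)
```

Only the SSN values of Dependents are needed to evaluate the semijoin, and the result carries no Dname column.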
Complex RA Expressions
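[Expression tree figure; the exact layout was not recovered. Surviving parts:
Root: Π_name
Joins: ⋈_{buyer-ssn=ssn}, ⋈_{pid=pid}, ⋈_{seller-ssn=ssn}
Subexpressions: Π_ssn(σ_{name=fred}(Person)), Π_pid(σ_{name=gizmo}(Product))
Leaves: Person, Purchase, Person, Product]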
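[Query plan figure; the exact layout was not recovered. Surviving parts: output attribute sname, selections bid=100 and rating > 5, join condition sid=sid (appearing twice), relations Reserves and Sailors.]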
The earlier we process selections, the fewer tuples we need to manipulate higher up in the tree (predicate pushdown)
Disadvantages?
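The effect of pushing selections down can be made concrete by counting the tuples that reach the join. All data below is hypothetical (20 sailors with one reservation each); only the relation names, the attributes sid, bid, rating, sname and the two conditions come from the slide.

```python
# Hypothetical data: 20 sailors, each with exactly one reservation.
sailors = [{"sid": i, "sname": f"s{i}", "rating": i % 10} for i in range(1, 21)]
reserves = [{"sid": i, "bid": 100 + i % 3} for i in range(1, 21)]

def join_on_sid(left, right):
    """Equi-join on sid: nested loops, merging the matching tuples."""
    return [{**l, **r} for l in left for r in right if l["sid"] == r["sid"]]

# Plan 1: join first, apply both selections afterwards.
joined_all = join_on_sid(reserves, sailors)
plan1 = [t["sname"] for t in joined_all if t["bid"] == 100 and t["rating"] > 5]

# Plan 2: push both selections below the join.
small_reserves = [r for r in reserves if r["bid"] == 100]
small_sailors = [s for s in sailors if s["rating"] > 5]
joined_small = join_on_sid(small_reserves, small_sailors)
plan2 = [t["sname"] for t in joined_small]

print(len(joined_all), len(joined_small))   # 20 3
print(plan1 == plan2)                       # True
```

Both plans return the same names, but the second one feeds 6 and 8 tuples into the join instead of 20 and 20.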
Operations on Bags
A bag = a set with repeated elements
All operations need to be defined carefully on bags
{a,b,b,c} ∪ {a,b,b,b,e,f,f} = {a,a,b,b,b,b,b,c,e,f,f}
{a,b,b,b,c,c} − {b,c,c,c,d} = {a,b,b}
σ_C(R): preserve the number of occurrences
Π_A(R): no duplicate elimination
Cartesian product, join: no duplicate elimination
Important! Relational engines work on bags, not sets!
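The two bag examples can be verified with `collections.Counter` from the Python standard library, which stores each element together with its number of occurrences. Using `Counter` as the bag representation is an assumption of this sketch; its `+` adds counts and its `-` subtracts them, dropping anything that would fall to zero or below.

```python
from collections import Counter

# Bag union: occurrence counts add up.
union = Counter("abbc") + Counter("abbbeff")
print(sorted(union.elements()))
# ['a', 'a', 'b', 'b', 'b', 'b', 'b', 'c', 'e', 'f', 'f']

# Bag difference: counts are subtracted, never going below zero.
difference = Counter("abbbcc") - Counter("bcccd")
print(sorted(difference.elements()))
# ['a', 'b', 'b']

# Bag projection: no duplicate elimination, so repeated values stay.
rows = [("John", 1), ("Tony", 1), ("Ann", 2)]   # hypothetical (Name, DepartmentID)
print([dept for (_, dept) in rows])
# [1, 1, 2]
```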
Find all direct and indirect relatives of Fred
Cannot express in RA!
Need to write a C program
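A fixed relational algebra expression performs a fixed number of joins, whereas this query needs to keep joining until no new relatives turn up. A minimal sketch of such a program follows, written in Python rather than C for brevity. The relation Relative(a, b), its contents, and the assumption that it is symmetric are all hypothetical.

```python
def all_relatives(pairs, person):
    """Everyone reachable from person by following the relation any number of times."""
    # Build a symmetric adjacency map from the (a, b) pairs.
    neighbours = {}
    for a, b in pairs:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    found = set()
    frontier = {person}
    while frontier:                       # iterate until a fixpoint is reached
        reached = set()
        for p in frontier:
            reached |= neighbours.get(p, set())
        frontier = reached - found - {person}
        found |= frontier
    return found

# Hypothetical Relative relation.
relative = [("Fred", "Mary"), ("Mary", "Lou"), ("Lou", "Joe"), ("Ann", "Bob")]
print(sorted(all_relatives(relative, "Fred")))   # ['Joe', 'Lou', 'Mary']
```

The number of loop iterations depends on the data, which is exactly what a single relational algebra expression cannot adapt to.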