Chapter 5 - Relational Algebra
Relational Algebra
Relational algebra is a theoretical language with operations that
work on one or more relations to define another relation without
changing the original relation.
The basic set of operations for the relational model is known as
the relational algebra.
These operations enable a user to specify basic retrieval requests.
The result of the retrieval is a new relation, which may have been
formed from one or more relations.
A sequence of relational algebra operations forms a relational
algebra expression, whose result will also be a relation that
represents the result of a database query (or retrieval request).
Relational Algebra … Cont’d
The output from one operation can become the input to another
operation (nesting is possible).
Several basic operations can be applied to the relations of a
database, depending on the requirement:
Selection ( σ ): Selects a subset of rows from a relation.
Projection ( π ): Deletes unwanted columns from a relation.
Renaming ( ρ ): Assigns a name to a relation or to an intermediate result.
Cross-Product ( × ): Allows us to combine two relations.
Set-Difference ( - ): Tuples in relation1, but not in relation2.
Union ( ∪ ): Tuples in relation1 or in relation2.
Intersection ( ∩ ): Tuples in relation1 and in relation2.
Join ( ⋈ ): Tuples joined from two relations based on a condition.
Relational Algebra … Cont’d
Table 1: Sample table used to illustrate the different kinds of relational
operations. The relation contains information about employees, the IT skills
they have, and the school where they studied each skill.
Selection
Selects the subset of tuples/rows in a relation that satisfy a selection condition.
Selection is a unary operator (it is applied to a single
relation).
The selection condition is applied to each tuple individually.
The degree of the resulting relation is the same as that of the original relation,
but its cardinality (number of tuples) is less than or equal to that of the original
relation.
The Selection operator is commutative: applying two selections in either
order gives the same result.
A set of conditions can be combined using the Boolean operations ∧ (AND),
∨ (OR), and ~ (NOT).
No duplicates in the result.
Selection … Cont’d
The result relation can be the input for another relational algebra operation
(operator composition).
Selection is a filter that keeps only those tuples that satisfy a qualifying
condition; the others are discarded.
Notation:
σ <Selection Condition> (<Relation Name>)
Example: Find all employees with a skill type of Database.
σ <SkillType = “Database”> (Employee)
This query extracts every tuple of the relation Employee, with all of its
attributes, in which the SkillType attribute has the value “Database”.
Selection … Cont’d
The resulting relation will be the following.
Example: Find all employees with SkillType Database and School Unity.
σ <SkillType = “Database” AND School = “Unity”> (Employee)
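The two selection examples can be sketched in Python, treating a relation as a list of dictionaries. This is only an illustration: the rows, the school name "Alpha", and the helper `select` are assumptions made here, while the attribute names SkillType, School, and SkillLevel follow the chapter.

```python
# Hypothetical rows; only the attribute names come from the chapter.
employee = [
    {"EmpId": 1, "SkillType": "Database", "School": "Unity", "SkillLevel": 8},
    {"EmpId": 2, "SkillType": "Networking", "School": "Unity", "SkillLevel": 6},
    {"EmpId": 3, "SkillType": "Database", "School": "Alpha", "SkillLevel": 4},
]


def select(relation, condition):
    """Sigma: return the tuples of `relation` that satisfy `condition`."""
    return [t for t in relation if condition(t)]


def is_database(t):
    return t["SkillType"] == "Database"


def is_unity(t):
    return t["School"] == "Unity"


# First example: all employees with skill type Database.
database_rows = select(employee, is_database)

# Second example: two conditions combined with AND in one selection.
database_at_unity = select(employee, lambda t: is_database(t) and is_unity(t))

# Commutativity: two nested selections give the same result in either order.
order_a = select(select(employee, is_database), is_unity)
order_b = select(select(employee, is_unity), is_database)
```

The degree of every result (the number of attributes per tuple) is unchanged, and only the number of tuples shrinks.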
Projection
Selects certain attributes from the base relation and discards the
others.
The PROJECT operation creates a vertical partitioning of the relation.
Deletes attributes that are not in the projection list.
The schema of the result contains exactly the fields in the projection list, with
the same names that they had in the (only) input relation.
The projection operator has to eliminate duplicates.
Note: real systems typically don’t do duplicate elimination unless
the user explicitly asks for it.
If the primary key is in the projection list, then duplicates cannot
occur.
Duplicate removal is necessary to ensure that the resulting table is
also a relation.
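A minimal sketch of projection with duplicate elimination is given below. The rows and the helper `project` are assumptions for illustration, and EmpId is assumed to be the primary key.

```python
# Hypothetical rows; EmpId is assumed to be the primary key.
employee = [
    {"EmpId": 1, "SkillType": "Database", "School": "Unity"},
    {"EmpId": 2, "SkillType": "Networking", "School": "Unity"},
    {"EmpId": 3, "SkillType": "Database", "School": "Alpha"},
]


def project(relation, attributes):
    """Pi: keep only `attributes`, then drop duplicate tuples."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:  # duplicate elimination keeps the result a relation
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result


# Without the key, two input tuples collapse into one output tuple.
skill_types = project(employee, ["SkillType"])

# With the primary key in the projection list, no duplicates can arise.
with_key = project(employee, ["EmpId", "SkillType"])
```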
Projection … Cont’d
Notation:
π <Selected Attributes> (<Relation Name>)
Example: To display Name, Skill, and Skill Level of an employee, the
query and the resulting relation will be:
Projection … Cont’d
Exercise: Write a relational algebra expression that displays the Name, Skill, and
Skill Level of the employees with Skill SQL and SkillLevel greater than
5.
Rename Operation
Allows us to name, and therefore to refer to, the results of relational
algebra expressions.
Allows us to refer to a relation by more than one name.
Example: ρ x (E) returns the result of expression E under the name x.
We may want to apply several relational algebra operations one after
the other. The query could be written in two different forms:
1. Write the operations as a single relational algebra
expression by nesting the operations.
2. Apply one operation at a time and create intermediate result
relations.
In the latter case, we must give names to the relations that hold the
intermediate results.
Rename Operation … Cont’d
If we want the Name, Skill, and Skill Level of the employees
with a salary greater than 1500 who work for department 5, we
can write the expression for this query using either of the two alternatives.
• The relation Result is then equivalent to the relation we get using the
first alternative.
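The two ways of writing this query can be sketched as follows. The rows, the attribute names Salary and DeptNo, and the intermediate name `dep5_emps` are hypothetical, chosen only to make the example runnable.

```python
# Hypothetical rows; Salary and DeptNo are assumed attribute names.
employee = [
    {"Name": "A", "Skill": "SQL", "SkillLevel": 7, "Salary": 2000, "DeptNo": 5},
    {"Name": "B", "Skill": "Java", "SkillLevel": 5, "Salary": 1200, "DeptNo": 5},
    {"Name": "C", "Skill": "SQL", "SkillLevel": 9, "Salary": 3000, "DeptNo": 4},
]


def select(relation, condition):
    """Sigma: keep the tuples that satisfy `condition`."""
    return [t for t in relation if condition(t)]


def project(relation, attributes):
    """Pi: keep only `attributes` and drop duplicate tuples."""
    result = []
    for t in relation:
        row = {a: t[a] for a in attributes}
        if row not in result:
            result.append(row)
    return result


def well_paid_in_dept_5(t):
    return t["Salary"] > 1500 and t["DeptNo"] == 5


wanted = ["Name", "Skill", "SkillLevel"]

# Alternative 1: a single expression with nested operations.
nested = project(select(employee, well_paid_in_dept_5), wanted)

# Alternative 2: one operation at a time, naming the intermediate relation.
dep5_emps = select(employee, well_paid_in_dept_5)
result = project(dep5_emps, wanted)
```

Naming the intermediate relation plays the role of the rename operation: `dep5_emps` can be referred to again in later steps.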
UNION Operation
The result of this operation, denoted by R ∪ S, is a relation that includes
all tuples that are either in R or in S or in both R and S.
Duplicate tuples are eliminated.
The two operands must be “type compatible”.
Type Compatibility
The operand relations R1(A1, A2, ..., An) and R2(B1, B2, ..., Bn) must
have the same number of attributes, and the domains of corresponding
attributes must be compatible; that is, Dom(Ai) = Dom(Bi) for i = 1, 2, ..., n.
The relation resulting from R1 ∪ R2, R1 ∩ R2, or R1 - R2 has the same
attribute names as the first operand relation R1 (by convention).
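The type-compatibility rule and the naming convention can be illustrated with a small sketch. The `Relation` class, the use of Python types as stand-ins for domains, and the sample relations are all assumptions made for this example.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    schema: tuple    # ((attribute name, domain), ...) in column order
    rows: frozenset  # set of value tuples, so duplicates cannot occur


def type_compatible(r, s):
    """Same degree and equal domains, position by position."""
    return [d for _, d in r.schema] == [d for _, d in s.schema]


def _require_compatible(r, s):
    if not type_compatible(r, s):
        raise ValueError("operands are not type compatible")


def union(r, s):
    _require_compatible(r, s)
    return Relation(r.schema, r.rows | s.rows)  # attribute names come from r


def intersection(r, s):
    _require_compatible(r, s)
    return Relation(r.schema, r.rows & s.rows)


def difference(r, s):
    _require_compatible(r, s)
    return Relation(r.schema, r.rows - s.rows)


# Hypothetical relations: the attribute names differ, the domains agree.
student = Relation((("FN", str), ("LN", str)),
                   frozenset({("Sara", "Bekele"), ("Dawit", "Alemu")}))
instructor = Relation((("FName", str), ("LName", str)),
                      frozenset({("Dawit", "Alemu"), ("Hana", "Tesfaye")}))

everyone = union(student, instructor)
both = intersection(student, instructor)
only_students = difference(student, instructor)
```

Storing the rows in a set makes duplicate elimination automatic, which matches the statement that duplicate tuples are eliminated.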
UNION Operation … Cont’d
Example
INTERSECTION Operation
The result of this operation, denoted by R ∩ S, is a relation that
includes all tuples that are in both R and S.
The two operands must be “type compatible”.
Set Difference (or MINUS) Operation
The result of this operation, denoted by R - S, is a relation that
includes all tuples that are in R but not in S.
The two operands must be “type compatible”.
Some Properties of the Set Operators
Both union and intersection are commutative
operations; that is,
R ∪ S = S ∪ R, and R ∩ S = S ∩ R
Both union and intersection can be treated as n-ary operations
applicable to any number of relations, as both are associative
operations; that is,
R ∪ (S ∪ T) = (R ∪ S) ∪ T, and (R ∩ S) ∩ T = R ∩ (S ∩ T)
The minus operation is not commutative; that is, in general,
R - S ≠ S - R
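These properties can be checked on small examples with Python's built-in sets. The three relations below are made up for illustration, and a check on one example is not a proof.

```python
# Hypothetical type-compatible relations, each a set of (name, level) tuples.
R = {("a", 1), ("b", 2)}
S = {("b", 2), ("c", 3)}
T = {("c", 3), ("d", 4)}

# Union and intersection are commutative.
union_commutes = (R | S) == (S | R)
intersection_commutes = (R & S) == (S & R)

# Union and intersection are associative.
union_associates = (R | (S | T)) == ((R | S) | T)
intersection_associates = ((R & S) & T) == (R & (S & T))

# Set difference is not commutative in general.
r_minus_s = R - S
s_minus_r = S - R
```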
CARTESIAN (cross product) Operation
This operation is used to combine tuples from two relations in a
combinatorial fashion.
That is, every tuple in Relation1 (R) is paired with
every tuple in Relation2 (S).
In general, the result of R(A1, A2, . . ., An) × S(B1, B2, . . ., Bm) is a
relation Q of degree n + m with attributes Q(A1, A2, . . ., An, B1,
B2, . . ., Bm), in that order,
where R has n attributes and S has m attributes.
The resulting relation Q has one tuple for each combination of tuples,
one from R and one from S.
Hence, if R has |R| tuples and S has |S| tuples, then R × S has
|R| * |S| tuples.
The two operands do NOT have to be “type compatible”.
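The degree and cardinality of a Cartesian product can be seen in a short sketch. The helper `cross_product` and the sample rows are assumptions for illustration.

```python
from itertools import product


def cross_product(r_attrs, r_rows, s_attrs, s_rows):
    """R x S: attributes of R followed by those of S, every pairing of tuples."""
    attrs = r_attrs + s_attrs
    rows = [r + s for r, s in product(r_rows, s_rows)]
    return attrs, rows


# Hypothetical relations: R has 2 attributes and 3 tuples, S has 1 and 2.
r_attrs, r_rows = ("EmpId", "FName"), [(1, "Sara"), (2, "Dawit"), (3, "Hana")]
s_attrs, s_rows = ("School",), [("Unity",), ("Alpha",)]

q_attrs, q_rows = cross_product(r_attrs, r_rows, s_attrs, s_rows)
```

Here Q has degree 2 + 1 = 3 and 3 * 2 = 6 tuples, and the operands did not need to be type compatible.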
CARTESIAN (cross product) Operation … Cont’d
Example
JOIN Operation
The sequence of a Cartesian product followed by a select is used quite
commonly to identify and select related tuples from two relations.
This special operation is called JOIN.
The JOIN operation is denoted by the symbol ⋈.
This operation is very important for any relational database with more
than a single relation, because it allows us to process relationships
among relations.
The general form of a join operation on two relations
R(A1, A2, . . ., An) and S(B1, B2, . . ., Bm) is:
R ⋈ <join condition> S
where R and S can be any relations that result from general relational
algebra expressions.
JOIN Operation … Cont’d
Since JOIN operates on two relations, it is a binary operation.
In a THETA JOIN (θ-JOIN), θ is the comparison operator
used in the join condition.
θ can be any of { <, ≤, >, ≥, ≠, = }.
In a theta join, tuples whose join attributes are null do not appear in the
result.
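A theta join can be sketched as a Cartesian product followed by a selection on the join condition. The relations, the attribute names Job and MinLevel, and the helper `theta_join` are hypothetical. Null is modelled as Python's `None`, and the two relations are assumed to have no attribute names in common.

```python
import operator
from itertools import product


def theta_join(r, s, r_attr, theta, s_attr):
    """Pair every tuple of r with every tuple of s, keep pairs satisfying theta."""
    result = []
    for t_r, t_s in product(r, s):
        a, b = t_r[r_attr], t_s[s_attr]
        if a is None or b is None:
            continue  # tuples with a null join attribute never appear
        if theta(a, b):
            result.append({**t_r, **t_s})
    return result


# Hypothetical relations with disjoint attribute names.
employee = [
    {"EmpId": 1, "SkillLevel": 8},
    {"EmpId": 2, "SkillLevel": 5},
    {"EmpId": 3, "SkillLevel": None},
]
job = [
    {"Job": "DBA", "MinLevel": 7},
    {"Job": "Support", "MinLevel": 4},
]

# Join condition: SkillLevel >= MinLevel, so theta is the operator ">=".
qualified = theta_join(employee, job, "SkillLevel", operator.ge, "MinLevel")
```

Passing `operator.eq` as `theta` would turn the same function into a join on equality.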
EQUIJOIN Operation
The most common use of join involves join conditions with equality
comparisons only ( = ).
Such a join, where the only comparison operator used is =, is called an
EQUIJOIN.
NATURAL JOIN Operation
The standard definition of natural join requires that the two join
attributes, or each pair of corresponding join attributes, have the
same name in both relations.
If this is not the case, a renaming operation on the attributes is applied
first. The result of the natural join is the set of all combinations of
tuples in R and S that are equal on their common attribute names.
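The renaming step and the natural join itself can be sketched as follows. The relations, the attribute names DNumber and DName, and the helpers `rename` and `natural_join` are assumptions for illustration.

```python
def rename(relation, old, new):
    """Rename attribute `old` to `new` in every tuple."""
    return [{(new if k == old else k): v for k, v in t.items()} for t in relation]


def natural_join(r, s):
    """Combine tuples of r and s that agree on all common attribute names."""
    result = []
    for t_r in r:
        for t_s in s:
            common = t_r.keys() & t_s.keys()
            if all(t_r[a] == t_s[a] for a in common):
                result.append({**t_r, **t_s})  # common attributes appear once
    return result


# Hypothetical relations; the join attributes have different names at first.
employee = [
    {"EmpId": 1, "DeptNo": 5},
    {"EmpId": 2, "DeptNo": 4},
    {"EmpId": 3, "DeptNo": 9},
]
department = [
    {"DNumber": 5, "DName": "Research"},
    {"DNumber": 4, "DName": "Admin"},
]

# Rename first so that both relations share the attribute name DeptNo.
joined = natural_join(employee, rename(department, "DNumber", "DeptNo"))
```

The employee in department 9 has no matching department tuple and therefore does not appear in the result.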
OUTER JOIN Operation
OUTER JOIN is a version of the JOIN operation in which non-matching
tuples from the first relation are also included in the
result; for such a tuple, the attributes of the second
relation have the value NULL.
It is an extension of the join operation that avoids loss of information.
Outer Join Can be:
Left Outer Join
Right Outer Join
Full Outer Join
Left Outer Join
The result of the left outer join is
the set of all combinations of
tuples in R and S that are equal on
their common attribute names, in
addition to tuples in R that have no
matching tuples in S.
Right Outer Join
The result of the right outer
join is the set of all
combinations of tuples in R and
S that are equal on their
common attribute names, in
addition to tuples in S that have
no matching tuples in R.
Full Outer Join
The result of the full outer join is
the set of all combinations of
tuples in R and S that are equal on
their common attribute names, in
addition to tuples in S that have no
matching tuples in R and tuples in
R that have no matching tuples in
S in their common attribute names
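The three outer joins can be sketched with one function. The relations and the helper `outer_join` are hypothetical, NULL is modelled as Python's `None`, and both inputs are assumed to be non-empty so that their attribute names can be read from the first tuple.

```python
def outer_join(r, s, how="full"):
    """Natural outer join of r and s; `how` is 'left', 'right', or 'full'."""
    r_attrs, s_attrs = set(r[0]), set(s[0])
    common = r_attrs & s_attrs

    def match(t_r, t_s):
        return all(t_r[a] == t_s[a] for a in common)

    # Matching combinations, exactly as in the natural join.
    result = [{**t_r, **t_s} for t_r in r for t_s in s if match(t_r, t_s)]

    if how in ("left", "full"):
        for t_r in r:
            if not any(match(t_r, t_s) for t_s in s):
                # Pad the attributes that exist only in s with None (NULL).
                result.append({**dict.fromkeys(s_attrs - common), **t_r})
    if how in ("right", "full"):
        for t_s in s:
            if not any(match(t_r, t_s) for t_r in r):
                # Pad the attributes that exist only in r with None (NULL).
                result.append({**dict.fromkeys(r_attrs - common), **t_s})
    return result


# Hypothetical relations sharing the attribute name DeptNo.
employee = [{"EmpId": 1, "DeptNo": 5}, {"EmpId": 3, "DeptNo": 9}]
department = [{"DeptNo": 5, "DName": "Research"}, {"DeptNo": 4, "DName": "Admin"}]

left = outer_join(employee, department, "left")
right = outer_join(employee, department, "right")
full = outer_join(employee, department, "full")
```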
SEMIJOIN Operation
Relational Calculus
A relational calculus expression creates a new relation, which is
specified in terms of variables that range over rows of the stored
database relations (in tuple calculus) or over columns of the stored
relations (in domain calculus).
In a calculus expression, there is no order of operations to specify how
to retrieve the query result.
A calculus expression specifies only what information the result should
contain rather than how to retrieve it.
Relational calculus gives no description of how to evaluate a
query; this is the main distinguishing feature between relational algebra
and relational calculus.
Relational calculus is considered to be a nonprocedural language.
Relational Calculus … Cont’d
This differs from relational algebra, where we must write a sequence
of operations to specify a retrieval request.
Hence relational algebra can be considered as a procedural way of
stating a query.
When applied to relational databases, “calculus” does not refer to
differential and integral calculus but to a form of first-order logic, or
predicate calculus.
A predicate is a truth-valued function with arguments.
When we substitute values for the arguments in the predicate, the
function yields an expression, called a proposition, which can be
either true or false.
Relational Calculus … Cont’d
If a predicate contains a variable, as in ‘x is a member of staff’, there
must be a range for x. When we substitute some values of this range for
x, the proposition may be true; for other values, it may be false.
If COND is a predicate, then the set of all tuples for which COND
evaluates to true is expressed as follows:
{t | COND(t)}
where t is a tuple variable and COND(t) is a conditional expression
involving t. The result of such a query is the set of all tuples t that
satisfy COND(t).
If we have a set of predicates to evaluate for a single query, the predicates
can be connected using ∧ (AND), ∨ (OR), and ~ (NOT).
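The expression {t | COND(t)} reads much like a Python set comprehension, which states which tuples are wanted rather than a sequence of operations. The sketch below is only an analogy: the relation, its rows, and the predicate `cond` are assumptions for illustration.

```python
from collections import namedtuple

# Hypothetical relation; namedtuples are hashable, so they can live in a set.
Employee = namedtuple("Employee", ["EmpId", "SkillType", "SkillLevel"])
employee = {
    Employee(1, "Database", 8),
    Employee(2, "Networking", 6),
    Employee(3, "Database", 4),
}


def cond(t):
    """COND(t): two predicates connected with AND, one of them negated."""
    return t.SkillType == "Database" and not t.SkillLevel < 5


# {t | COND(t)} with t ranging over the employee relation.
result = {t for t in employee if cond(t)}
```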
Tuple-oriented Relational Calculus
The tuple relational calculus is based on specifying a number of tuple
variables.
Tuple relational calculus is interested in finding tuples for which a
predicate is true for a relation.
Based on use of tuple variables.
Tuple variable is a variable that ‘ranges over’ a named relation: that is,
a variable whose only permitted values are tuples of the relation.
If E is a tuple variable that ranges over the relation EMPLOYEE, this is
written as EMPLOYEE(E); that is, the range of E is EMPLOYEE.
To extract all tuples that satisfy a certain condition, we
write the set of all tuples E such that COND(E) evaluates to true.
Tuple-oriented Relational Calculus … Cont’d
{E | COND(E)}
The predicates can be connected using the Boolean operators:
∧ (AND), ∨ (OR), ∼ (NOT)
COND(t) is a formula, and is called a Well-Formed Formula (WFF)
if it is composed of n-ary predicates (a formula
composed of n single predicates) that are
connected by any of the Boolean operators.
Tuple-oriented Relational Calculus … Cont’d
Exercise: Find the EmpId, FName, LName, Skill, and the School where the
skill was attended, for employees with a skill level greater than or equal
to 8.
Quantifiers in Relation Calculus
To state how many instances a predicate applies to, we can use the
two quantifiers of predicate logic: the existential quantifier ∃ (‘there
exists’) and the universal quantifier ∀ (‘for all’).
A relational calculus expression written using the existential quantifier can
also be written using the universal quantifier.
Quantifiers in Relation Calculus … Cont’d
2. Universal quantifier ∀ (‘for all’)
Universal quantifier is used in statements about every instance, such
as:
An employee with skill level greater than or equal to 8 will be:
{E | Employee(E) ∧ (∀E)(E.SkillLevel >= 8)}
This reads: for all tuples of the relation Employee, the value of the
SkillLevel attribute is greater than or equal to 8.
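The two quantifiers correspond roughly to Python's `any` (there exists) and `all` (for all). The sketch below uses made-up skill levels and also checks, on this one example, the standard equivalence (∃x) P(x) ≡ ~(∀x) ~P(x), which is why an expression written with one quantifier can be rewritten with the other.

```python
# Hypothetical SkillLevel values of the tuples in an employee relation.
skill_levels = [8, 9, 6]


def p(level):
    """The predicate P: skill level greater than or equal to 8."""
    return level >= 8


# Existential quantifier: at least one tuple satisfies P.
exists_high = any(p(level) for level in skill_levels)

# Universal quantifier: every tuple satisfies P.
forall_high = all(p(level) for level in skill_levels)

# The existential statement rewritten with the universal quantifier.
exists_via_forall = not all(not p(level) for level in skill_levels)
```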