Relational Algebra
operator, S is the new relation name and B1, B2,…Bn are the new attribute names.
The first expression renames both the relation and its attributes, the second renames the relation only,
and the third renames the attributes only.
Relational Algebra Operations from Set Theory: The UNION, INTERSECTION, and MINUS Operations:
These are binary operations; that is, each is applied to two sets of tuples.
Two relations R(A1, A2, …, An) and S(B1, B2, …, Bn) are said to be union compatible if they have the
same degree n and if dom(Ai) = dom(Bi) for 1 ≤ i ≤ n.
This means that the two relations have the same number of attributes and each corresponding pair of
attributes has the same domain.
We can define the three operations UNION, INTERSECTION and SET DIFFERENCE (also called
MINUS) on the two union-compatible relations R and S as follows:
UNION: The result of this operation, denoted by R ∪ S, is a relation that includes all tuples that
are either in R or in S or in both R and S. Duplicate tuples are eliminated.
INTERSECTION: The result of this operation, denoted by R ∩ S, is a relation that includes all
tuples that are in both R and S.
SET DIFFERENCE (or MINUS): The result of this operation, denoted by R – S, is a relation that
includes all tuples that are in R but not in S.
The UNION and INTERSECTION operations are commutative; that is, R ∪ S = S ∪ R and R ∩ S = S ∩ R.
Both are also associative; that is, R ∪ (S ∪ T) = (R ∪ S) ∪ T and (R ∩ S) ∩ T = R ∩ (S ∩ T).
The MINUS operation, however, is not commutative; that is, in general, R – S ≠ S – R.
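As a rough illustration, union-compatible relations can be modeled as Python sets of tuples, where the built-in set operators behave like UNION, INTERSECTION, and MINUS. The relation contents below are made up for the example.

```python
# Hypothetical union-compatible relations: each tuple is (Name, Dept).
R = {("Asha", "Physics"), ("Binod", "Math")}
S = {("Binod", "Math"), ("Chitra", "Biology")}

union = R | S          # tuples in R or S; a set drops duplicates automatically
intersection = R & S   # tuples in both R and S
difference = R - S     # tuples in R but not in S

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
```

Swapping the operands leaves `R | S` and `R & S` unchanged, while `R - S` and `S - R` differ for this data.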
The CARTESIAN PRODUCT (CROSS PRODUCT) Operation:
The CARTESIAN PRODUCT operation, denoted by ×, produces a new element by combining every member
(tuple) from one relation (set) with every member (tuple) from the other relation (set).
In general, the result of R(A1, A2, …, An) × S(B1, B2, …, Bm) is a relation Q with degree n + m and
attributes Q(A1, A2, …, An, B1, B2, …, Bm), in that order. The resulting relation Q has one tuple for
each combination of tuples, one from R and one from S.
This operation is useful when followed by a selection that matches values of attributes coming
from the component relations. For example, to retrieve the names of all instructors in the Physics
department and the CourseID of each course that they taught, we write:
Π Name, CourseID (σ Instructor.ID = Teaches.ID (σ Department_Name = ’Physics’ (Instructor × Teaches)))
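The Physics query above can be mimicked step by step in Python. This is only a sketch: the sample tuples and the attribute order (ID, Name, Department_Name for Instructor; ID, CourseID for Teaches) are assumptions, not taken from the notes.

```python
from itertools import product

# Hypothetical sample data; attribute order is an assumption.
# Instructor(ID, Name, Department_Name), Teaches(ID, CourseID)
instructor = {(1, "Einstein", "Physics"), (2, "Mozart", "Music")}
teaches = {(1, "PHY-101"), (1, "PHY-201"), (2, "MUS-199")}

# Instructor x Teaches: every instructor tuple paired with every teaches tuple.
# Each result row is (ID, Name, Department_Name, Teaches.ID, CourseID).
cross = {i + t for i, t in product(instructor, teaches)}

# Select Department_Name = 'Physics', then Instructor.ID = Teaches.ID.
physics = {row for row in cross if row[2] == "Physics"}
matched = {row for row in physics if row[0] == row[3]}

# Project on Name, CourseID.
result = {(row[1], row[4]) for row in matched}
print(sorted(result))
```

The product has 2 × 3 = 6 tuples of degree 3 + 2 = 5 before the selections filter it down.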
Outer Join Operations:
The outer join operation works in a manner similar to the natural join operation but preserves those
tuples that would be lost in an inner join by creating tuples in the result that contain null values.
There are three forms of the operation: left outer join, denoted by ⟕; right outer join, denoted by
⟖; and full outer join, denoted by ⟗.
The left outer join (⟕) takes all tuples in the left relation that did not match any tuple in the
right relation, pads them with null values for all other attributes from the right relation, and adds
them to the result of the natural join.
The right outer join (⟖) is symmetric with the left outer join. It pads tuples from the right relation
that did not match any from the left relation with nulls and adds them to the result of the natural join.
The full outer join (⟗) does both the left and right outer join operations: it pads unmatched tuples
from both relations with nulls and adds them to the result of the join.
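The three outer joins can be sketched in Python on top of a simple natural join. The relations, attribute names, and helper functions below are hypothetical, and None stands in for null.

```python
# Hypothetical relations as lists of dicts; None stands in for null.
instructor = [{"ID": 1, "Name": "Einstein"}, {"ID": 2, "Name": "Mozart"}]
teaches = [{"ID": 1, "CourseID": "PHY-101"}, {"ID": 3, "CourseID": "BIO-301"}]

def natural_join(left, right):
    """Combine tuples that agree on every attribute name they share."""
    return [{**l, **r} for l in left for r in right
            if all(l[a] == r[a] for a in l.keys() & r.keys())]

def left_outer_join(left, right):
    """Natural join plus unmatched left tuples padded with None."""
    right_attrs = {a for r in right for a in r}
    joined = natural_join(left, right)
    for l in left:
        if not natural_join([l], right):
            joined.append({**{a: None for a in right_attrs}, **l})
    return joined

def right_outer_join(left, right):
    """Symmetric to the left outer join."""
    return left_outer_join(right, left)

def full_outer_join(left, right):
    """Natural join plus the padded unmatched tuples from both sides."""
    result = left_outer_join(left, right)
    result += [t for t in right_outer_join(left, right) if t not in result]
    return result

for row in full_outer_join(instructor, teaches):
    print(row)
```

With this data, Mozart (no matching Teaches tuple) is kept with a null CourseID, and the Teaches tuple with ID 3 is kept with a null Name.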
The generalized form of the projection operation can be expressed as: Π F1, F2, …, Fn (R), where
F1, F2, …, Fn are functions over the attributes in relation R and may involve constants.
For example, Π ID, Name, Dept_name, Salary ÷ 12 (EMPLOYEE)
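A minimal Python sketch of this example, assuming EMPLOYEE tuples of the form (ID, Name, Dept_name, Salary) with made-up values:

```python
# Hypothetical EMPLOYEE tuples: (ID, Name, Dept_name, Salary).
employee = [(1, "Asha", "Physics", 120000), (2, "Binod", "Math", 90000)]

# Generalized projection: the last output column is the expression Salary / 12
# rather than a stored attribute.
monthly = [(eid, name, dept, salary / 12) for eid, name, dept, salary in employee]
print(monthly)
```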
Aggregation:
The aggregate operation 𝒢 permits the use of aggregate functions such as min or average on sets of
values.
The symbol 𝒢 is the letter G in calligraphic font; read it as “calligraphic G”.
Aggregate functions take a collection of values and return a single value as a result. For example, the
relational-algebra expression for calculating the sum of salaries of all employees is:
𝒢 sum(Salary) (EMPLOYEE)
To eliminate multiple occurrences of a value before computing an aggregate function, use the
hyphenated string “distinct” appended to the end of the aggregate function name:
𝒢 count-distinct(Address) (EMPLOYEE)
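The general form of the aggregate operation is:
G1, G2, …, Gn 𝒢 F1(A1), F2(A2), …, Fm(Am) (E)
where E is any relational-algebra expression; G1, G2, …, Gn constitute a list of attributes on which
to group; each Fi is an aggregate function and each Ai is an attribute name.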
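The aggregate computations above can be approximated in Python as follows. The tuple layout (Name, Dept_name, Salary, Address) and the data are assumptions made for this sketch, and grouping by Dept_name is only one possible choice of grouping attribute.

```python
from collections import defaultdict

# Hypothetical EMPLOYEE tuples: (Name, Dept_name, Salary, Address).
employee = [
    ("Asha", "Physics", 1000, "Kathmandu"),
    ("Binod", "Physics", 2000, "Pokhara"),
    ("Chitra", "Math", 1500, "Kathmandu"),
]

total_salary = sum(e[2] for e in employee)           # sum(Salary)
distinct_addresses = len({e[3] for e in employee})   # count-distinct(Address)

# Grouping: Dept_name is the grouping attribute, sum(Salary) is computed per group.
salary_by_dept = defaultdict(int)
for _, dept, salary, _ in employee:
    salary_by_dept[dept] += salary

print(total_salary, distinct_addresses, dict(salary_by_dept))
```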
The Outer Union Operation:
It is used to take the union of tuples from two relations if the relations are not union compatible.
This operation will take the union of tuples in two relations R(X,Y) and S(X,Z) that are partially
compatible; meaning that only some of their attributes, say X, are union compatible.
Attributes that are union compatible are represented only once in the result, and those attributes that
are not union compatible from either relation are also kept in the result relation T (X, Y, Z). For
example, an OUTER UNION can be applied to two relations whose schemas are STUDENT (Name,
Department, Advisor) and INSTRUCTOR (Name, Department, Rank). The result relation will have
the following attributes:
Result←(Name, Department, Advisor, Rank)
All the tuples from both relations are included in the result.
Tuples appearing only in STUDENT will have a null for the Rank attribute, whereas tuples
appearing only in INSTRUCTOR will have a null for the Advisor attribute.
A tuple that exists in both relations will have values for all its attributes.
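A small Python sketch of the STUDENT/INSTRUCTOR outer union. The sample tuples and the `outer_union` helper are hypothetical; the sketch assumes tuples are matched on the shared attributes (Name, Department) and that each relation holds at most one tuple per such combination. None stands in for null.

```python
# Hypothetical tuples; None stands in for null.
student = [{"Name": "Asha", "Department": "Physics", "Advisor": "Einstein"}]
instructor = [
    {"Name": "Einstein", "Department": "Physics", "Rank": "Professor"},
    {"Name": "Asha", "Department": "Physics", "Rank": "Lecturer"},
]

def outer_union(r, s, shared):
    """Outer union of r and s, matching tuples on the shared attributes."""
    # Result schema: every attribute of either relation, each listed once.
    attrs = list(dict.fromkeys(a for t in r + s for a in t))
    merged = {}
    for t in r + s:
        key = tuple(t[a] for a in shared)
        row = merged.setdefault(key, {a: None for a in attrs})
        row.update(t)  # fill in whatever attributes this tuple provides
    return list(merged.values())

result = outer_union(student, instructor, ["Name", "Department"])
for row in result:
    print(row)
```

Asha appears in both relations and so has values for all four attributes; Einstein appears only in INSTRUCTOR and gets a null Advisor.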
Division Operation:
The division is a binary operation that is written as R÷S.
The result consists of the restrictions of tuples in R to the attribute names unique to R, for which
all of their combinations with tuples in S are present in R. For example,
see the tables COMPLETED, DBPROJECT and their division:
If DBPROJECT contains all the tasks of the database project, then the result of the division above
contains exactly the students who have completed all the tasks in the database project.
More formally, it is defined as:
R ÷ S = {t[a1, …, an] : t ϵ R ^ ∀s ϵ S ((t[a1, …, an] ∪ s) ϵ R)} where {a1, …, an} is the set of attribute
names unique to R and t[a1, …, an] is the restriction of t to this set.
It is usually required that the attribute names in the header of S are a subset of those of R because
otherwise the result of the operation will always be empty.
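The definition can be tried out in Python. The COMPLETED and DBPROJECT contents below are invented for illustration, and the `divide` helper is a sketch that assumes S is non-empty and that the attributes of S are the trailing attributes of R.

```python
# Hypothetical data: COMPLETED(Student, Task), DBPROJECT(Task).
completed = {
    ("Fred", "Database1"), ("Fred", "Database2"),
    ("Eugene", "Database1"),
    ("Sarah", "Database1"), ("Sarah", "Database2"),
}
dbproject = {("Database1",), ("Database2",)}

def divide(r, s):
    """R divided by S, assuming S is non-empty and its attributes trail in R."""
    width = len(next(iter(s)))
    # Restrict each tuple of R to the attributes unique to R.
    candidates = {t[:-width] for t in r}
    # Keep a candidate only if combining it with every tuple of S lands in R.
    return {c for c in candidates if all(c + u in r for u in s)}

result = divide(completed, dbproject)
print(sorted(result))
```

Eugene is excluded because the combination (Eugene, Database2) is not in COMPLETED.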
Examples of Queries in Relational Algebra: see the textbook.
A complete set of Relational Algebra Operations: The set of relational algebra operations {σ, π, ∪, ρ, –,
×} is a complete set because any of the relational algebra operations can be expressed as a sequence of
operations from this set. For example, the INTERSECTION operation can be expressed as follows:
R ∩ S = (R ∪ S) – ((R – S) ∪ (S – R))
Similarly, a NATURAL JOIN can be specified as a CARTESIAN PRODUCT preceded by RENAME and
followed by SELECT and PROJECT operations.
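The INTERSECTION identity can be checked on a small example using Python sets, with `|` playing the role of UNION and `-` the role of MINUS. The relation contents are made up.

```python
# Hypothetical union-compatible relations as sets of tuples.
R = {("Asha",), ("Binod",), ("Chitra",)}
S = {("Binod",), ("Chitra",), ("Dipak",)}

# INTERSECTION built only from UNION (|) and MINUS (-).
via_identity = (R | S) - ((R - S) | (S - R))

print(via_identity == R & S)  # prints True
```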
Relational Calculus:
Relational calculus is an alternative to relational algebra.
In relational calculus, we write one declarative expression to specify a retrieval request; hence, there
is no description of how to evaluate a query.
It specifies what is to be retrieved rather than how to retrieve it. Therefore, it is considered to be a
nonprocedural language.
Any retrieval that can be specified in the basic relational algebra can also be specified in relational
calculus, and vice-versa, i.e., the expressive power of the two languages is identical.
One variant is called Tuple Relational Calculus (TRC) and another is called Domain Relational
Calculus (DRC).
Relational Algebra and Calculus are the foundation of query languages like SQL.
Both are closed languages: results of queries on relations are relations.
Tuple Relational calculus:
The tuple relational calculus is based on specifying a number of tuple variables.
Variables in expressions represent a row of a relation (tuple variable).
A simple tuple relational calculus query is of the form: {t | COND(t)} where t is a tuple variable and
COND(t) is a conditional expression involving t. The result of such a query is the set of all tuples t
that satisfy COND(t).
Informally, we need to specify the following information in a tuple calculus expression:
A set of attributes to be retrieved, the requested attributes. E.g. t.Fname, t.Lname
For each tuple variable t, the range relation R. E.g. t ϵ EMPLOYEE
A condition to select particular combinations of tuples. E.g. t.Salary > 50000
Examples of Queries in Tuple Relational Calculus (Basic operations only)
Find all employees whose salary is above $50,000.
{t | t ϵ EMPLOYEE ^ t.Salary>50000}
To retrieve only the first and last names of all employees whose salary is above $50,000:
{t.Fname, t.Lname | t ϵ EMPLOYEE ^ t.Salary > 50000}
Retrieve the birth date and address of the employees whose name is ‘Hari Bahadur Karki’.
{t.Bdate, t.Address | t ϵ EMPLOYEE ^ t.Fname=’Hari’ ^ t.Mname=’Bahadur’ ^ t.Lname=’Karki’}
List the name and address of all employees who work for the ‘Research’ department.
{t.Fname, t.Lname, t.Address | t ϵ EMPLOYEE ^ (∃d)(DEPARTMENT(d) ^ d.Dname=’Research’ ^
d.Dnumber=t.Dno)}
Retrieve the employee’s first and last name and the first and last name of his/her immediate
supervisor.
{e.Fname, e.Lname, s.Fname, s.Lname | e ϵ EMPLOYEE ^ s ϵ EMPLOYEE ^ e.Super_SSN=s.SSN}
Return the Name of every author in Author who has an article on ‘DBMS’.
{t.Name | t ϵ Author ^ t.Article=’DBMS’}
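Tuple calculus queries read much like Python comprehensions: the tuple variable ranges over a relation, the condition becomes the filter, and an existential quantifier maps onto `any`. The sketch below mirrors three of the queries above; the relation contents and the reduced attribute lists are assumptions.

```python
from collections import namedtuple

# Hypothetical schemas with only the attributes these queries need.
Employee = namedtuple("Employee", "Fname Lname Address Salary Dno")
Department = namedtuple("Department", "Dname Dnumber")

EMPLOYEE = [
    Employee("Hari", "Karki", "Kathmandu", 60000, 5),
    Employee("Sita", "Rai", "Pokhara", 40000, 4),
]
DEPARTMENT = [Department("Research", 5), Department("Administration", 4)]

# {t | t in EMPLOYEE ^ t.Salary > 50000}
high_paid = [t for t in EMPLOYEE if t.Salary > 50000]

# {t.Fname, t.Lname | t in EMPLOYEE ^ t.Salary > 50000}
names = [(t.Fname, t.Lname) for t in EMPLOYEE if t.Salary > 50000]

# {t.Fname, t.Lname, t.Address | t in EMPLOYEE ^
#   (exists d)(DEPARTMENT(d) ^ d.Dname = 'Research' ^ d.Dnumber = t.Dno)}
research = [
    (t.Fname, t.Lname, t.Address)
    for t in EMPLOYEE
    if any(d.Dname == "Research" and d.Dnumber == t.Dno for d in DEPARTMENT)
]

print(high_paid, names, research)
```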
A tuple variable t is bound if it appears in an (∃t) or (∀t) clause; otherwise, it is free.
The (∃) quantifier is called an existential quantifier because a formula (∃t)F is true if there exists
some tuple that makes F true.
The (∀) quantifier is called the universal or for all quantifier because every tuple in the universe of
tuples must make F true to make the quantified formula (∀t)F true.
Find all students who have taken all courses offered in the Biology department:
{t | ∃r ϵ STUDENT (r.ID = t.ID) ^ (∀u ϵ COURSE (u.Dept_Name = ’Biology’ ⇒ ∃s ϵ TAKES (t.ID = s.ID
^ s.CourseID = u.CourseID)))}
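The "all Biology courses" query can be sketched in Python, where a universal quantifier maps onto `all` and an existential one onto `any`. An implication "p implies q" is written as "(not p) or q". The sample relations are hypothetical.

```python
# Hypothetical sample relations as lists of dicts.
STUDENT = [{"ID": 1}, {"ID": 2}]
COURSE = [
    {"CourseID": "BIO-101", "Dept_Name": "Biology"},
    {"CourseID": "BIO-301", "Dept_Name": "Biology"},
    {"CourseID": "PHY-101", "Dept_Name": "Physics"},
]
TAKES = [
    {"ID": 1, "CourseID": "BIO-101"}, {"ID": 1, "CourseID": "BIO-301"},
    {"ID": 2, "CourseID": "BIO-101"}, {"ID": 2, "CourseID": "PHY-101"},
]

# For every course u: either u is not a Biology course, or some TAKES tuple s
# links student t to course u.
result = [
    t for t in STUDENT
    if all(
        u["Dept_Name"] != "Biology"
        or any(s["ID"] == t["ID"] and s["CourseID"] == u["CourseID"] for s in TAKES)
        for u in COURSE
    )
]
print(result)
```

Student 2 is left out because no TAKES tuple pairs that student with BIO-301.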
Safe Expressions:
A safe expression in relational calculus is one that is guaranteed to yield a finite number of tuples as
its result; otherwise, the expression is called unsafe. For example, the expression:
{t| NOT (EMPLOYEE(t))} is unsafe because it yields all tuples in the universe that are not
EMPLOYEE tuples, which are infinitely numerous.
Domain Relational Calculus (Basic Operations only)
There is another type of relational calculus called the domain relational calculus, or simply, domain
calculus.
The formal specification of the domain calculus was proposed after the development of the QBE
(Query-By-Example) system.
Domain calculus differs from tuple calculus in the type of variables used in formulas; rather than
having variables range over tuples, the variables range over single values from domains of attributes.
To form a relation of degree n for a query result, we must have n of these domain variables, one for
each attribute.
An expression of the domain calculus is of the form:
{x1, x2, …, xn | COND(x1, x2, …, xn, xn+1, xn+2, …, xn+m)} where x1, x2, …, xn, xn+1, xn+2, …, xn+m
are domain variables that range over domains of attributes and COND is a condition or formula of
the domain relational calculus.
Examples:
Find the ID, Name, Dept_Name and Salary for employees whose salary is greater than 20,000:
{< i, n, d, s > | < i, n, d, s > ϵ EMPLOYEE ^ s > 20000}
Find the names of all employees whose salary is greater than 20,000:
{<n> | ∃ i,d,s (< i,n,d,s> ϵ EMPLOYEE ^ s > 20000)}
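In Python terms, domain variables correspond to unpacking each tuple into one variable per attribute. A minimal sketch of the two EMPLOYEE queries above, with made-up data and an assumed attribute order (ID, Name, Dept_Name, Salary):

```python
# Hypothetical EMPLOYEE tuples: (ID, Name, Dept_Name, Salary).
EMPLOYEE = [(1, "Asha", "Physics", 25000), (2, "Binod", "Math", 18000)]

# {<i, n, d, s> | <i, n, d, s> in EMPLOYEE ^ s > 20000}
full_rows = [(i, n, d, s) for (i, n, d, s) in EMPLOYEE if s > 20000]

# {<n> | exists i, d, s (<i, n, d, s> in EMPLOYEE ^ s > 20000)}
# i, d, and s are bound by the loop but not returned, like the quantified variables.
names = {n for (i, n, d, s) in EMPLOYEE if s > 20000}

print(full_rows, names)
```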
Names of all customers:
{< ti, t > | ∃ tf, mi (< ti, tf, mi > ϵ TAPE ^ ∃ i, c, y, d, p, l (< i, t, c, y, d, p, l > ϵ MOVIE ^
mi = i))}
(Table: TAPE(TID, TFormat, Movie_ID))