Relational Algebra

Uploaded by

ramehoinama
Relational Algebra:

 Relational algebra is a procedural query language.


 It defines a set of operations for the relational data model. These operations enable a user to specify
basic retrieval requests, whose result is a new relation.
 In addition to defining database structure and constraints, a data model must include a set of operations
to manipulate the database. The basic set of operations for the relational model is the relational algebra.
 The relational algebra is very important for several reasons:
 It provides a formal foundation for implementing and optimizing queries in RDBMSs.
 Some of its concepts are incorporated into the SQL standard query language for RDBMSs.
 The core operations and functions of any relational system are based on relational algebra
operations, although no commercial RDBMS provides an interface for relational algebra queries.
 Relational operations that operate on a single relation are called unary operations, and those that
operate on pairs of relations are called binary operations.
The Select Operation:
 It is used to select a subset of the tuples from a relation that satisfies a selection condition.
 It can be considered to be a filter that keeps only those tuples that satisfy a qualifying condition. For
example, to select the employee tuples whose department is 4, or those whose salary is greater than Rs.
30,000, we can individually specify each of these two conditions with a SELECT operation as follows:
σDno=4 (EMPLOYEE)
σSalary > 30,000 (EMPLOYEE)
 In general, the SELECT operation is denoted by σ<Selection condition> (R) where the symbol σ (sigma) is
used to denote the SELECT operator and the selection condition is a Boolean expression specified on the
attributes of relation R.
 The relation resulting from the SELECT operation has the same attributes as R.
 R is generally a relational algebra expression whose result is a relation. The simplest such expression is
just the name of a database relation.
 The SELECT operator is unary, that is, it is applied to a single relation.
 The SELECT operation is commutative, that is,
σ <condition1> (σ<condition2> (R)) = σ <condition2> (σ<condition1> (R))
 We can also combine a cascade of SELECT operations into a single operation as follows:
σ <cond1> (σ<cond2> (…(σ <condN> (R))…))= σ<cond1> AND <cond2> AND...AND <condN> (R)
Examples:
σSalary > 30,000 (σDno=4 (EMPLOYEE))=σDno=4 (σSalary > 30,000 (EMPLOYEE))
σDno=4 AND Salary > 30,000 (EMPLOYEE)
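The commutativity and cascade properties of SELECT can be checked with a small Python sketch. The EMPLOYEE rows and the helper name `select` are hypothetical, chosen only for illustration; a relation is modelled here as a list of dictionaries.

```python
# Hypothetical sample data: each dict is one EMPLOYEE tuple.
EMPLOYEE = [
    {"Fname": "Ram", "Dno": 4, "Salary": 35000},
    {"Fname": "Sita", "Dno": 4, "Salary": 25000},
    {"Fname": "Hari", "Dno": 5, "Salary": 40000},
]


def select(condition, relation):
    """sigma_<condition>(relation): keep the tuples that satisfy the condition."""
    return [t for t in relation if condition(t)]


def dno_is_4(t):
    return t["Dno"] == 4


def high_salary(t):
    return t["Salary"] > 30000


# Commutativity: the order of two selections does not matter.
a = select(high_salary, select(dno_is_4, EMPLOYEE))
b = select(dno_is_4, select(high_salary, EMPLOYEE))

# Cascade: nested selections equal one selection with the conditions ANDed.
c = select(lambda t: dno_is_4(t) and high_salary(t), EMPLOYEE)

print(a == b == c)  # True
print(a)
```

The result keeps every attribute of EMPLOYEE, which matches the statement that SELECT does not change the attributes of R.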
The Project Operation:
 The SELECT operation selects some of the rows from the table while discarding other rows.
 The PROJECT operation selects certain columns from the table and discards the other columns.
 If we are interested in only certain attributes of a relation, we use the PROJECT operation. Therefore,
the result of the PROJECT operation can be visualized as a vertical partition of the relation into two
relations: one has the needed columns (attributes) and contains the result of the operation and the other
contains the discarded columns.
 For example, to list each employee’s first and last name and salary, we can use the PROJECT operation
as follows:
Π Lname,Fname,Salary (EMPLOYEE)

 The general form of the PROJECT operation is:
Π <attribute list> (R) where Π (pi) is the symbol used to represent the PROJECT operation, and
<attribute list> is the desired list of attributes from the attributes of relation R.


 In general, R is a relational algebra expression whose result is a relation, which in the simplest case is
just the name of a database relation.
 The PROJECT operation removes any duplicate tuples, so the result of the PROJECT operation is a set
of tuples. This is known as duplicate elimination.
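 The commutative law does not hold for the PROJECT operation. For example,
Π Fname,Salary (Π Fname,phone,Salary (EMPLOYEE)) is valid and equals Π Fname,Salary (EMPLOYEE), whereas
Π Fname,phone,Salary (Π Fname,Salary (EMPLOYEE)) is not a valid expression, because the result of the
inner projection no longer has a phone attribute.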
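A minimal Python sketch of PROJECT, assuming a relation is stored as a list of dictionaries. The sample rows and the helper name `project` are hypothetical. It illustrates duplicate elimination and why the two nested projections above cannot be swapped.

```python
# Hypothetical sample data; each dict is one EMPLOYEE tuple.
EMPLOYEE = [
    {"Fname": "Ram", "phone": "111", "Salary": 30000},
    {"Fname": "Ram", "phone": "222", "Salary": 30000},
    {"Fname": "Sita", "phone": "333", "Salary": 45000},
]


def project(attributes, relation):
    """pi_<attributes>(relation): keep the listed columns, drop duplicate tuples."""
    result = []
    for t in relation:
        row = {a: t[a] for a in attributes}  # KeyError if an attribute is missing
        if row not in result:                # duplicate elimination
            result.append(row)
    return result


names_and_pay = project(["Fname", "Salary"], EMPLOYEE)
print(len(EMPLOYEE), len(names_and_pay))  # 3 2

# Projecting a wider attribute list out of a narrower result is not defined.
try:
    project(["Fname", "phone", "Salary"], names_and_pay)
    nested_ok = True
except KeyError:
    nested_ok = False
print(nested_ok)  # False
```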


The Rename Operation:
 We can write several relational algebra operations as a single relational algebra expression, for example:
Π Lname,Fname,Salary (σDno=5 (EMPLOYEE))
 We can also apply one operation at a time and create intermediate result relations. In this case, we must
give names to the relations that hold the intermediate results, as follows:
DEP5_EMPS← σDno=5 (EMPLOYEE)
RESULT← Π Lname,Fname,Salary (DEP5_EMPS)
 To rename the attributes in a relation, simply list the new attribute names in parentheses, as shown
below:
R (First_Name, Last_Name, Salary) ← Π Fname,Lname,Salary (TEMP)
 If no renaming is applied, the names of the attributes in the resulting relation of a SELECT operation are
the same as those in the original relation. For a PROJECT operation with no renaming, the resulting
relation has the same attribute names as those in the projection list.
 The general RENAME operation when applied to a relation R of degree n is denoted by any of the
following three forms:
ρ S(B1,B2,…,Bn) (R) or ρ S (R) or ρ (B1,B2,…,Bn) (R), where the symbol ρ (rho) is used to denote the
RENAME operator, S is the new relation name and B1, B2, …, Bn are the new attribute names.
 The first expression renames both the relation and its attributes, the second renames the relation only,
and the third renames the attributes only.
Relational Algebra Operations from Set Theory: The UNION, INTERSECTION, & MINUS
Operations:
 These are binary operations; that is, each is applied to two sets of tuples.
 Two relations R(A1,A2,….An) and S(B1,B2,….Bn) are said to be union compatible if they have the
same degree n and if dom(Ai) = dom(Bi) for 1≤i≤n.
 This means that the two relations have the same number of attributes and each corresponding pair of
attributes has the same domain.
 We can define the three operations UNION, INTERSECTION and SET DIFFERENCE (also called
MINUS) on the two union-compatible relations R and S as follows:
 UNION: The result of this operation, denoted by R ∪ S, is a relation that includes all tuples that
are either in R or in S or in both R and S. Duplicate tuples are eliminated.
 INTERSECTION: The result of this operation, denoted by R∩S, is a relation that includes all
tuples that are in both R and S.
 SET DIFFERENCE (or MINUS): The result of this operation denoted by R-S, is a relation that
includes all tuples that are in R but not in S.
 The UNION and INTERSECTION operations are commutative, that is, R ∪ S = S ∪ R and R ∩ S = S ∩ R.
 Both are also associative, that is, R ∪ (S ∪ T) = (R ∪ S) ∪ T and (R ∩ S) ∩ T = R ∩ (S ∩ T).
 But, the MINUS operation is not commutative; that is, in general, R-S≠S-R.
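Python's built-in `set` type behaves like these three operations, so the laws above can be checked directly. The relations R, S and T below are hypothetical and union compatible (each tuple has two string attributes).

```python
# Hypothetical union-compatible relations: sets of (Fname, Lname) tuples.
R = {("Ram", "Karki"), ("Sita", "Thapa")}
S = {("Sita", "Thapa"), ("Hari", "Rai")}
T = {("Gita", "Shah")}

union = R | S          # UNION, duplicates are eliminated automatically
intersection = R & S   # INTERSECTION
difference = R - S     # SET DIFFERENCE (MINUS)

print(len(union))      # 3
print(intersection)    # {('Sita', 'Thapa')}
print(difference)      # {('Ram', 'Karki')}

# Commutativity and associativity of UNION and INTERSECTION.
print(R | S == S | R, R & S == S & R)                          # True True
print(R | (S | T) == (R | S) | T, (R & S) & T == R & (S & T))  # True True

# MINUS is not commutative in general.
print(R - S == S - R)  # False
```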

The Cartesian product (Cross Product) Operation:


 It is also known as CROSS PRODUCT or CROSS JOIN and is denoted by ×.
 It is a binary operation but the relations do not have to be union compatible.

 It produces a new element by combining every member (tuple) from one relation (set) with every
member (tuple) from the other relation (set).
 In general, the result of R(A1,A2,…,An) × S(B1,B2,…,Bm) is a relation Q of degree n + m with
attributes Q(A1,A2,…,An,B1,B2,…,Bm), in that order. The resulting relation Q has one tuple for
each combination of tuples, one from R and one from S.
 This operation is useful when followed by a selection that matches values of attributes coming
from the component relations. For example, to retrieve the names of all instructors in the Physics
department and the CourseID of the courses that they taught, we write:
Π Name,CourseID (σInstructor.ID=Teaches.ID (σDepartment_Name=’Physics’ (Instructor × Teaches)))
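The Instructor × Teaches query can be sketched in Python with `itertools.product`. The rows below and the helper name `cross` are hypothetical; attribute names are prefixed with the relation name so that the two ID columns stay distinct, as in the expression above.

```python
from itertools import product

# Hypothetical sample data.
Instructor = [
    {"ID": 1, "Name": "Einstein", "Department_Name": "Physics"},
    {"ID": 2, "Name": "Katz", "Department_Name": "Comp. Sci."},
]
Teaches = [
    {"ID": 1, "CourseID": "PHY-101"},
    {"ID": 2, "CourseID": "CS-101"},
]


def cross(r, r_name, s, s_name):
    """R x S: pair every tuple of R with every tuple of S."""
    return [
        {
            **{f"{r_name}.{k}": v for k, v in t.items()},
            **{f"{s_name}.{k}": v for k, v in u.items()},
        }
        for t, u in product(r, s)
    ]


pairs = cross(Instructor, "Instructor", Teaches, "Teaches")
print(len(pairs))  # 4, that is 2 x 2 combinations

# Inner selection on the department, outer selection on matching IDs, then projection.
physics = [p for p in pairs if p["Instructor.Department_Name"] == "Physics"]
matched = [p for p in physics if p["Instructor.ID"] == p["Teaches.ID"]]
result = [(p["Instructor.Name"], p["Teaches.CourseID"]) for p in matched]
print(result)  # [('Einstein', 'PHY-101')]
```

Each combined tuple has 3 + 2 = 5 attributes, which matches the degree n + m stated above.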

Binary Relational Operations: JOIN and DIVISION


The Join Operation:
 The JOIN operation, denoted by ⋈, is used to combine related tuples from two relations into single
tuples.
 It allows us to process relationships among relations. For example, to retrieve the name of the
manager of each department we need to combine each department tuple with the employee tuple
whose Emp_ID value matches the Mgr_ID in the department tuple. We do this by using the JOIN
operation and then projecting the result over the necessary attributes, as follows:
DEPT_MGR ← DEPARTMENT ⋈ Mgr_ID=Emp_ID EMPLOYEE
RESULT← Π DName,Fname,Lname (DEPT_MGR)
 The general form of a JOIN operation on two relations R(A1, A2,….An) and S(B1, B2,…Bm) is:
R ⋈ <join condition> S
 The result of the JOIN is a relation Q with n+m attributes Q(A1,A2,…,An,B1,B2,…,Bm) in that
order; Q has one tuple for each combination of tuples, one from R and one from S, whenever the
combination satisfies the join condition.
 A general join condition is of the form: <condition> AND <condition> AND ...AND <condition>
where each condition is of the form Ai θ Bj, Ai is an attribute of R and Bj is an attribute of S, Ai and
Bj have the same domain and θ (theta) is one of the comparison operators {=, <, ≤, >, ≥, ≠}.
 A join operation with such a general join condition is called a THETA JOIN.
 Tuples whose join attributes are NULL or for which the join condition is FALSE do not appear in the
result.
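A minimal Python sketch of a THETA JOIN with a single condition Ai θ Bj, applied to the DEPT_MGR example. The rows and the helper name `theta_join` are hypothetical; NULL is modelled as `None`, and the sketch assumes the two relations have no attribute names in common.

```python
import operator

# Hypothetical sample data.
DEPARTMENT = [
    {"DName": "Research", "Mgr_ID": 101},
    {"DName": "Sales", "Mgr_ID": None},  # NULL manager
]
EMPLOYEE = [
    {"Emp_ID": 101, "Fname": "Ram", "Lname": "Karki"},
    {"Emp_ID": 102, "Fname": "Sita", "Lname": "Thapa"},
]


def theta_join(r, s, a, theta, b):
    """Join r and s on the condition (r.a theta s.b).

    Pairs whose join attribute is NULL (None), or for which the
    condition is false, do not appear in the result.
    """
    out = []
    for t in r:
        for u in s:
            if t[a] is None or u[b] is None:
                continue
            if theta(t[a], u[b]):
                out.append({**t, **u})  # assumes disjoint attribute names
    return out


DEPT_MGR = theta_join(DEPARTMENT, EMPLOYEE, "Mgr_ID", operator.eq, "Emp_ID")
RESULT = [(t["DName"], t["Fname"], t["Lname"]) for t in DEPT_MGR]
print(RESULT)  # [('Research', 'Ram', 'Karki')]
```

Passing another comparison such as `operator.lt` or `operator.ne` for `theta` gives the other forms of theta join.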
Variations of JOIN: The EQUI JOIN and NATURAL JOIN
 A JOIN, where the only comparison operator used is =, is called an EQUI JOIN.
 In the result of an EQUI JOIN, we always have one or more pairs of attributes that have identical
values in every tuple. For example, the value of the attributes Mgr_ID and Emp_ID are identical in
every tuple of DEPT_MGR.
 A new operation called NATURAL JOIN, denoted by * was created to get rid of the second attribute
(which is superfluous) in an EQUI JOIN condition.
 NATURAL JOIN is basically an EQUI JOIN followed by removal of the superfluous attributes.
 The standard definition of NATURAL JOIN requires that the two join attributes must have the same
name in both relations. If this is not the case, a renaming operation is applied first.
DEPT←ρ(Dname,Dnum,Mgr_ID,Mgr_StartDate)(DEPARTMENT)
PROJ_DEPT←PROJECT * DEPT
 In general, NATURAL JOIN is performed by equating all attribute pairs that have the same name in
the two relations.
 These join operations are also called inner joins.
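The PROJECT * DEPT example can be sketched in Python as follows. The rows and the helper name `natural_join` are hypothetical. The join equates all attributes that have the same name in both relations and keeps each of them only once.

```python
# Hypothetical sample data; Dnum is the only attribute name shared by both relations.
PROJECT = [
    {"Pname": "ProductX", "Dnum": 5},
    {"Pname": "ProductY", "Dnum": 4},
]
DEPT = [
    {"Dname": "Research", "Dnum": 5, "Mgr_ID": 101},
]


def natural_join(r, s):
    """R * S: equate all same-named attributes and keep each of them once."""
    out = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()
            if all(t[a] == u[a] for a in common):
                out.append({**t, **u})  # shared attributes are merged into one column
    return out


PROJ_DEPT = natural_join(PROJECT, DEPT)
print(PROJ_DEPT)
# [{'Pname': 'ProductX', 'Dnum': 5, 'Dname': 'Research', 'Mgr_ID': 101}]
```

If the two relations share no attribute name, the condition is trivially true for every pair and the sketch degenerates to a Cartesian product.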

Outer Join Operations:
 The outer join operation works in a manner similar to the natural join operation but preserves those
tuples that would be lost in an inner join, creating tuples in the result that contain null values.
 There are three forms of the operation: left outer join, denoted by ⟕; right outer join, denoted by
⟖; and full outer join, denoted by ⟗.
 The left outer join (⟕) takes all tuples in the left relation that did not match with any tuple in the
right relation, pads the tuples with null values for all other attributes from the right relation and adds
them to the result of the natural join.

 The right outer join (⟖) is symmetric with the left outer join. It pads tuples from the right relation
that did not match any from the left relation with nulls and adds them to the result of the natural join.

 The full outer join (⟗) does both: it pads the unmatched tuples from the left relation as well as the
unmatched tuples from the right relation with nulls and adds them to the result of the join.
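A minimal Python sketch of the left outer join, built on a natural join. The relations, their rows and the helper name `left_outer_join` are hypothetical; null is modelled as `None`. The attribute names of the right relation are passed explicitly so that padding also works when the right relation is empty.

```python
# Hypothetical sample data.
EMPLOYEE = [
    {"Emp_ID": 1, "Name": "Ram"},
    {"Emp_ID": 2, "Name": "Sita"},
]
DEPENDENT = [
    {"Emp_ID": 1, "Dep_Name": "Gita"},
]


def left_outer_join(r, s, s_attributes):
    """Natural join of r and s that also keeps the unmatched tuples of r."""
    out = []
    for t in r:
        matches = [
            u for u in s
            if all(t[a] == u[a] for a in t.keys() & u.keys())
        ]
        if matches:
            out.extend({**t, **u} for u in matches)
        else:
            # Pad the attributes that come only from the right relation with nulls.
            padding = {a: None for a in s_attributes if a not in t}
            out.append({**t, **padding})
    return out


result = left_outer_join(EMPLOYEE, DEPENDENT, ["Emp_ID", "Dep_Name"])
for row in result:
    print(row)
# {'Emp_ID': 1, 'Name': 'Ram', 'Dep_Name': 'Gita'}
# {'Emp_ID': 2, 'Name': 'Sita', 'Dep_Name': None}
```

A right outer join is the same sketch with the roles of the two relations exchanged, and a full outer join adds the padded unmatched tuples from both sides.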

Additional Relational Operations:


Generalized Projection:
 It extends the projection operation by allowing functions of attributes to be included in the projection
list.

 The generalized form can be expressed as: Π F1, F2, ... Fn (R), where F1, F2,….,Fn are functions over the
attributes in relation R and may involve constants.
 For example, Π ID, Name,Dept_name,Salary ÷12 (EMPLOYEE)
Aggregation:
 The aggregate operation 𝒢 permits the use of aggregate functions such as min or average on sets of
values.
 The symbol 𝒢 is the letter G in calligraphic font; read it as “calligraphic G”.
 Aggregate functions take a collection of values and return a single value as a result. For example, the
relational-algebra expression for calculating the sum of salaries of all employees is:
𝒢 sum(salary) (EMPLOYEE)

 To eliminate multiple occurrences of a value before computing an aggregate function, use the
hyphenated string “distinct” appended to the end of the aggregate function name:
𝒢 count-distinct(Address) (EMPLOYEE)

 To find the average salary of all employees, we write:
𝒢 average(salary) (EMPLOYEE)

 To find the average salary in each department, we write:
Dept_Name 𝒢 average(salary) (EMPLOYEE)

 The general form of the aggregate operation is as follows:
G1,G2,…,Gn 𝒢 F1(A1),F2(A2),…,Fm(Am) (E) where E is any relational-algebra expression; G1,G2,…,Gn
constitute a list of attributes on which to group; each Fi is an aggregate function and each Ai is an
attribute name.
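The grouped aggregate can be sketched in Python. This is a simplified illustration with one aggregate function per call; the EMPLOYEE rows and the helper name `aggregate` are hypothetical. An empty grouping list treats the whole relation as a single group.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample data.
EMPLOYEE = [
    {"Name": "Ram", "Dept_Name": "Research", "Salary": 30000},
    {"Name": "Sita", "Dept_Name": "Research", "Salary": 50000},
    {"Name": "Hari", "Dept_Name": "Sales", "Salary": 20000},
]


def aggregate(relation, group_by, func, attribute):
    """Group the tuples on group_by, then apply func to each group's attribute values."""
    groups = defaultdict(list)
    for t in relation:
        key = tuple(t[g] for g in group_by)
        groups[key].append(t[attribute])
    return {key: func(values) for key, values in groups.items()}


# sum(salary) over all employees: no grouping attributes, one group with key ().
total = aggregate(EMPLOYEE, [], sum, "Salary")

# average(salary) per department.
avg_by_dept = aggregate(EMPLOYEE, ["Dept_Name"], mean, "Salary")

# count-distinct: remove duplicate values before counting.
distinct_depts = aggregate(EMPLOYEE, [], lambda v: len(set(v)), "Dept_Name")

print(total[()])                   # 100000
print(avg_by_dept[("Research",)])  # 40000
print(distinct_depts[()])          # 2
```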
The Outer Union Operation:
 It is used to take the union of tuples from two relations if the relations are not union compatible.
 This operation will take the union of tuples in two relations R(X,Y) and S(X,Z) that are partially
compatible; meaning that only some of their attributes, say X, are union compatible.
 Attributes that are union compatible are represented only once in the result, and those attributes that
are not union compatible from either relation are also kept in the result relation T (X, Y, Z). For
example, an OUTER UNION can be applied to two relations whose schemas are STUDENT (Name,
Department, Advisor) and INSTRUCTOR (Name, Department, Rank). The result relation will have
the following attributes:
Result←(Name, Department, Advisor, Rank)
 All the tuples from both relations are included in the result.
 Tuples appearing only in STUDENT will have a null for the Rank attribute, whereas, tuples
appearing only in INSTRUCTOR will have a null for the Advisor attribute.
 A tuple that exists in both relations will have values for all its attributes.
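The STUDENT and INSTRUCTOR example can be sketched in Python. The rows and the helper name `outer_union` are hypothetical, and the sketch assumes that the shared attributes (Name, Department) identify at most one tuple in each relation.

```python
# Hypothetical sample data.
STUDENT = [
    {"Name": "Ram", "Department": "CS", "Advisor": "Sharma"},
    {"Name": "Sita", "Department": "Math", "Advisor": "Joshi"},
]
INSTRUCTOR = [
    {"Name": "Sita", "Department": "Math", "Rank": "Lecturer"},
    {"Name": "Hari", "Department": "CS", "Rank": "Professor"},
]


def outer_union(r, s, shared, r_only, s_only):
    """OUTER UNION of r and s, matching tuples on the shared attributes."""
    merged = {}
    for relation, own in ((r, r_only), (s, s_only)):
        for t in relation:
            key = tuple(t[a] for a in shared)
            if key not in merged:
                row = dict(zip(shared, key))
                # Start with nulls for every attribute that is not shared.
                row.update({a: None for a in r_only + s_only})
                merged[key] = row
            merged[key].update({a: t[a] for a in own})
    return list(merged.values())


result = outer_union(
    STUDENT, INSTRUCTOR, ["Name", "Department"], ["Advisor"], ["Rank"]
)
for row in result:
    print(row)
# {'Name': 'Ram', 'Department': 'CS', 'Advisor': 'Sharma', 'Rank': None}
# {'Name': 'Sita', 'Department': 'Math', 'Advisor': 'Joshi', 'Rank': 'Lecturer'}
# {'Name': 'Hari', 'Department': 'CS', 'Advisor': None, 'Rank': 'Professor'}
```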
Division Operation:
 The division is a binary operation that is written as R÷S.
 The result consists of the restrictions of tuples in R to the attribute names unique to R. For example,
see the tables COMPLETED, DBPROJECT and their division:

 If DBPROJECT contains all the tasks of the database project, then the result of the division above
contains exactly the students who have completed all the tasks in the database project.
 More formally, it is defined as:
R ÷ S = {t[a1,…,an] : t ϵ R ^ ∀s ϵ S ((t[a1,…,an] ∪ s) ϵ R)} where {a1,…,an} is the set of attribute
names unique to R and t[a1,…,an] is the restriction of t to this set.
 It is usually required that the attribute names in the header of S be a subset of those of R, because
otherwise the result of the operation will always be empty.
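A minimal Python sketch of division. The contents of COMPLETED and DBPROJECT below are hypothetical, since the original tables are not reproduced here. The helper `divide` is also hypothetical and assumes that the attributes of S are the trailing attributes of R.

```python
# Hypothetical data: COMPLETED(Student, Task) and DBPROJECT(Task).
COMPLETED = {
    ("Ram", "Database1"),
    ("Ram", "Database2"),
    ("Sita", "Database1"),
}
DBPROJECT = {("Database1",), ("Database2",)}


def divide(r, s, s_degree):
    """R divided by S, where the last s_degree attributes of R are those of S."""
    # Restrict each tuple of R to the attributes unique to R.
    candidates = {t[:-s_degree] for t in r}
    # Keep a candidate only if it is paired with every tuple of S in R.
    return {c for c in candidates if all(c + u in r for u in s)}


print(divide(COMPLETED, DBPROJECT, 1))  # {('Ram',)}
```

Sita is excluded because the tuple (Sita, Database2) is not in COMPLETED.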
Examples of Queries in Relational Algebra: see from book.
A complete set of Relational Algebra Operations: The set of relational algebra operations {σ, π, ∪, ρ, –,
×} is a complete set because any of the relational algebra operations can be expressed as a sequence of
operations from this set. For example, the INTERSECTION operation can be expressed as follows:
R ∩ S = (R ∪ S) – ((R – S) ∪ (S – R))
Similarly, a NATURAL JOIN can be specified as a CARTESIAN PRODUCT preceded by RENAME and
followed by SELECT and PROJECT operations.
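The INTERSECTION identity above can be checked on small sets in Python. The relations R and S are hypothetical one-attribute relations.

```python
# Hypothetical union-compatible relations with one attribute each.
R = {(1,), (2,), (3,)}
S = {(2,), (3,), (4,)}

# Right-hand side uses only UNION and MINUS, both from the complete set.
via_complete_set = (R | S) - ((R - S) | (S - R))

print(via_complete_set == R & S)  # True
```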
Relational Calculus:
 Relational calculus is an alternative to relational algebra.
 In relational calculus, we write one declarative expression to specify a retrieval request; hence, there
is no description of how to evaluate a query.
 It specifies what is to be retrieved rather than how to retrieve it. Therefore, it is considered to be a
nonprocedural language.
 Any retrieval that can be specified in the basic relational algebra can also be specified in relational
calculus, and vice-versa, i.e., the expressive power of the two languages is identical.
 One variant is called Tuple Relational Calculus (TRC) and another is called Domain Relational
Calculus (DRC).
 Relational Algebra and Calculus are the foundation of query languages like SQL.
 Both are closed languages: results of queries on relations are relations.
Tuple Relational calculus:
 The tuple relational calculus is based on specifying a number of tuple variables.
 Variables in expressions represent a row of a relation (tuple variable).
 A simple tuple relational calculus query is of the form: {t | COND(t)} where t is a tuple variable and
COND(t) is a conditional expression involving t. The result of such a query is the set of all tuples t
that satisfy COND(t).
 Informally, we need to specify the following information in a tuple calculus expression:
 A set of attributes to be retrieved, the requested attributes. E.g. t.Fname, t.Lname
 For each tuple variable t, the range relation R. e.g. t ϵ EMPLOYEE
 A condition to select particular combination of tuples. E.g. t.Salary>50000
Examples of Queries in Tuple Relational Calculus (Basic operations only)
 Find all employees whose salary is above $50,000.
{t | t ϵ EMPLOYEE ^ t.Salary>50000}
 To retrieve only the first and last names of all employees whose salary is above $50,000:
{t.Fname, t.Lname | t ϵ EMPLOYEE ^ t.Salary > 50000}
 Retrieve the birth date and address of the employees whose name is ‘Hari Bahadur Karki’.
{t.Bdate, t.Address | t ϵ EMPLOYEE ^ t.Fname=’Hari’ ^ t.Mname=’Bahadur’ ^ t.Lname=’Karki’}
 List the name and address of all employees who work for the ‘Research’ department.
{t.Fname, t.Lname, t.Address | t ϵ EMPLOYEE ^ (∃d)(DEPARTMENT(d) ^ d.Dname=’Research’ ^
d.Dnumber=t.Dno)}
 Retrieve the employee’s first and last name and the first and last name of his/her immediate
supervisor.
{e.Fname, e.Lname, s.Fname, s.Lname | e ϵ EMPLOYEE ^ s ϵ EMPLOYEE ^ e.Super_SSN=s.SSN}
 Return the Name of each Author who has an article on ‘DBMS’.
{t.Name | t ϵ Author ^ t.Article=’DBMS’}
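Tuple calculus expressions read much like Python set comprehensions: the tuple variable ranges over a relation, the condition filters, and the listed attributes form the result. The sketch below assumes hypothetical sample rows; the existential quantifier (∃d) is written with `any`.

```python
from collections import namedtuple

# Hypothetical relations and rows.
Employee = namedtuple("Employee", "Fname Lname Salary Dno")
Department = namedtuple("Department", "Dname Dnumber")

EMPLOYEE = {
    Employee("Ram", "Karki", 60000, 5),
    Employee("Sita", "Thapa", 40000, 4),
}
DEPARTMENT = {
    Department("Research", 5),
    Department("Administration", 4),
}

# {t.Fname, t.Lname | t in EMPLOYEE and t.Salary > 50000}
high_paid = {(t.Fname, t.Lname) for t in EMPLOYEE if t.Salary > 50000}

# {t.Fname, t.Lname | t in EMPLOYEE and
#     (exists d)(DEPARTMENT(d) and d.Dname = 'Research' and d.Dnumber = t.Dno)}
research = {
    (t.Fname, t.Lname)
    for t in EMPLOYEE
    if any(d.Dname == "Research" and d.Dnumber == t.Dno for d in DEPARTMENT)
}

print(high_paid)  # {('Ram', 'Karki')}
print(research)   # {('Ram', 'Karki')}
```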
 A tuple variable t is bound if it appears in an (∃t) or (∀t) clause; otherwise, it is free.
 The (∃) quantifier is called an existential quantifier because a formula (∃t)F is true if there exists
some tuple that makes F true.
 The (∀) quantifier is called the universal or for all quantifier because every tuple in the universe of
tuples must make F true to make the quantified formula (∀t)F true.
 Find all students who have taken all courses offered in the Biology department:
{t | ∃r ϵ STUDENT (r.ID = t.ID) ^ (∀u ϵ COURSE (u.Dept_Name = ’Biology’ ⇒ ∃s ϵ TAKES (t.ID = s.ID
^ s.CourseID = u.CourseID)))}
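The Biology query can be sketched in Python, with the universal quantifier written as `all` and the existential quantifier as `any`. The implication "P implies Q" is written as `not P or Q`. The sample rows are hypothetical.

```python
from collections import namedtuple

# Hypothetical relations and rows.
Student = namedtuple("Student", "ID Name")
Course = namedtuple("Course", "CourseID Dept_Name")
Takes = namedtuple("Takes", "ID CourseID")

STUDENT = {Student(1, "Ram"), Student(2, "Sita")}
COURSE = {
    Course("BIO-101", "Biology"),
    Course("BIO-301", "Biology"),
    Course("CS-101", "Comp. Sci."),
}
TAKES = {
    Takes(1, "BIO-101"),
    Takes(1, "BIO-301"),
    Takes(2, "BIO-101"),
    Takes(2, "CS-101"),
}

# IDs of students who have taken every Biology course.
result = {
    r.ID
    for r in STUDENT
    if all(
        u.Dept_Name != "Biology"
        or any(s.ID == r.ID and s.CourseID == u.CourseID for s in TAKES)
        for u in COURSE
    )
}

print(result)  # {1}
```

Student 2 is excluded because no TAKES tuple pairs that student with BIO-301.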
Safe Expressions:
 A safe expression in relational calculus is one that is guaranteed to yield a finite number of tuples as
its result; otherwise, the expression is called unsafe. For example, the expression:
{t| NOT (EMPLOYEE(t))} is unsafe because it yields all tuples in the universe that are not
EMPLOYEE tuples, which are infinitely numerous.
Domain Relational Calculus (Basic Operations only)
 There is another type of relational calculus called the domain relational calculus, or simply, domain
calculus.
 The formal specification of the domain calculus was proposed after the development of the QBE
(Query-By-Example) system.
 Domain calculus differs from tuple calculus in the type of variables used in formulas; rather than
having variables range over tuples, the variables range over single values from domains of attributes.
 To form a relation of degree n for a query result, we must have n of these domain variables, one for
each attribute.
 An expression of the domain calculus is of the form:
{x1,x2,….xn | COND (x1,x2,…,xn,xn+1,xn+2,…..,xn+m)} where x1,x2,….xn,xn+1,xn+2,….xn+m
are domain variables that range over domains of attributes and COND is a condition or formula of
the domain relational calculus.
Examples:
 Find the ID, Name, Dept_Name and Salary for employees whose salary is greater than 20,000:
{< i, n, d, s > | < i, n, d, s > ϵ EMPLOYEE ^ s > 20000}
 Find the names of all employees whose salary is greater than 20,000:
{<n> | ∃ i,d,s (< i,n,d,s> ϵ EMPLOYEE ^ s > 20000)}
 Names of all customers (Table: CUSTOMER(Mem_ID, First_Name, Last_Name, Address, Telephone)):
{< l, f > | ∃ m, a, t (< m, f, l, a, t > ϵ CUSTOMER)}
 All customers named Ram:
{< m, l, f > | ∃ a, t (< m, f, l, a, t > ϵ CUSTOMER ^ f=”Ram”)}
 All movies by George Lucas from 1999 or later (Table: MOVIE(ID, Title, Category, Year, Director, Price,
Length)):
{< i, t > | ∃ c, y, d, p, l (< i, t, c, y, d, p, l > ϵ MOVIE ^ d=”Lucas” ^ y>=1999)}
 All tapes and their corresponding movie (Table: TAPE(TID, TFormat, Movie_ID)):
{< ti, t > | ∃ tf, mi (< ti, tf, mi > ϵ TAPE ^ ∃ i, c, y, d, p, l (< i, t, c, y, d, p, l > ϵ MOVIE ^ mi=i))}
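Domain calculus queries can be sketched in Python by unpacking each tuple into one variable per attribute, which mirrors the domain variables above. The sample rows are hypothetical; variables that appear only under ∃ are simply unused in the output.

```python
# Hypothetical rows: EMPLOYEE(ID, Name, Dept_Name, Salary).
EMPLOYEE = {
    (1, "Ram", "Research", 25000),
    (2, "Sita", "Sales", 18000),
}
# Hypothetical rows: CUSTOMER(Mem_ID, First_Name, Last_Name, Address, Telephone).
CUSTOMER = {
    (1, "Ram", "Karki", "Kathmandu", "555-0101"),
    (2, "Sita", "Thapa", "Pokhara", "555-0102"),
}

# {<i, n, d, s> | <i, n, d, s> in EMPLOYEE and s > 20000}
well_paid = {(i, n, d, s) for (i, n, d, s) in EMPLOYEE if s > 20000}

# {<n> | exists i, d, s (<i, n, d, s> in EMPLOYEE and s > 20000)}
well_paid_names = {(n,) for (i, n, d, s) in EMPLOYEE if s > 20000}

# {<l, f> | exists m, a, t (<m, f, l, a, t> in CUSTOMER)}
customer_names = {(l, f) for (m, f, l, a, t) in CUSTOMER}

# {<m, l, f> | exists a, t (<m, f, l, a, t> in CUSTOMER and f = "Ram")}
rams = {(m, l, f) for (m, f, l, a, t) in CUSTOMER if f == "Ram"}

print(well_paid_names)  # {('Ram',)}
print(rams)             # {(1, 'Karki', 'Ram')}
```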
