Relational Algebra in Database

The document discusses two classes of data manipulation languages: procedural (low-level) languages like relational algebra that specify the order of operations, and nonprocedural (high-level) languages like SQL that specify what data is required without how to get it. SQL can be used standalone as a query language or embedded in a programming language. Relational algebra is a set of operations for specifying queries in the relational model.
DBMS Languages

Data Manipulation Language (DML)


Two classes of languages

Procedural (Low Level)
• The user specifies what data is required and how to get it.
• Example: Relational Algebra.
• In RA, we specify the order in which the operations have to be performed.

Nonprocedural (High Level)
• The user specifies what data is required without specifying how to get it.
• Example: SQL.
• SQL can be used in a standalone way (as a query language) or embedded in a programming language (the host language).
Relational Algebra

Operations in RDBMS: Retrieval and Update

• Relational Algebra is a set of operations for specifying retrieval requests (or queries) in the relational model.
• A relational algebra expression is a sequence of relational algebra operations.
Company Database
Select Operation (unary operation)
• This operation selects a subset of tuples from a relation that satisfy a selection condition.
• Select is denoted by: σ <selection condition> (R)
Examples: Select Operation
• Select the employees whose department number is 4:
  σ DNO = 4 (EMPLOYEE)
• Select the employees whose salary is greater than $35,000:
  σ Salary > 35000 (EMPLOYEE)
• Print the projects offered by department 5.
• Print the SSN of employees who work more than 10 hours on a project.
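A minimal sketch of SELECT in Python, assuming a relation is modeled as a list of dicts. The `select` helper and the sample rows are illustrative assumptions, not part of the slides.

```python
# Hypothetical sample of EMPLOYEE; the values are made up for illustration.
EMPLOYEE = [
    {"SSN": "1", "FNAME": "John", "DNO": 5, "SALARY": 30000},
    {"SSN": "2", "FNAME": "Jennifer", "DNO": 4, "SALARY": 43000},
    {"SSN": "3", "FNAME": "Ahmad", "DNO": 4, "SALARY": 25000},
    {"SSN": "4", "FNAME": "James", "DNO": 1, "SALARY": 55000},
]


def select(relation, condition):
    """sigma_condition(relation): keep the tuples for which condition holds."""
    return [t for t in relation if condition(t)]


# sigma DNO = 4 (EMPLOYEE)
dept4 = select(EMPLOYEE, lambda t: t["DNO"] == 4)
# sigma Salary > 35000 (EMPLOYEE)
high_paid = select(EMPLOYEE, lambda t: t["SALARY"] > 35000)

print([t["FNAME"] for t in dept4])      # ['Jennifer', 'Ahmad']
print([t["FNAME"] for t in high_paid])  # ['Jennifer', 'James']
```

The result always has the same attributes as the input relation; only the number of tuples can shrink.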

Select Operation
• The selection condition is a Boolean expression specified on the attributes of relation R.
• It can include the Boolean operators AND, OR, and NOT applied to comparisons that use the relational operators <, >, <=, >=, !=, =.
• Select is commutative:
  σ <condition1> (σ <condition2> (R)) = σ <condition2> (σ <condition1> (R))
• A cascade of Select operations can be combined into a single Select with a conjunctive condition:
  σ <cond1> (σ <cond2> (σ <cond3> (R))) = σ <cond1> AND <cond2> AND <cond3> (R)
• Example of a compound condition:
  σ (DNO = 4 AND Salary > 25000) OR (DNO = 5 AND Salary > 30000) (EMPLOYEE)
Project Operation (unary operation)
• It selects a subset of columns from the relation.
• It is denoted by π <attribute list> (R).
• It removes duplicate tuples.
• Example: the result of π LNAME, FNAME, SALARY (EMPLOYEE) is a set of tuples.

Project Operation
• Example 1: π SALARY (π LNAME, FNAME, SALARY (EMPLOYEE))
• Example 2: π LNAME, FNAME, SALARY (π SALARY (EMPLOYEE))
• Example 2 is not a valid expression, because the inner projection keeps only SALARY. The Project operation is therefore not commutative.
Project Operation
• The Project operation is not commutative.
• π <list1> (π <list2> (R)) = π <list1> (R), as long as <list2> contains the attributes in <list1>.
• The number of tuples in the result of the projection π <list> (R) is less than or equal to the number of tuples in R.
• If the list of attributes includes a key of R, then the number of tuples in the result is equal to the number of tuples in R.
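The tuple-count properties of PROJECT can be checked with a small Python sketch. The `project` helper and the rows are assumptions made for illustration; two employees share a salary on purpose.

```python
# Hypothetical EMPLOYEE rows; SSN is assumed to be the key.
EMPLOYEE = [
    {"SSN": "1", "LNAME": "Smith", "SALARY": 30000},
    {"SSN": "2", "LNAME": "Wong", "SALARY": 40000},
    {"SSN": "3", "LNAME": "Zelaya", "SALARY": 30000},
]


def project(relation, attrs):
    """pi_attrs(relation): keep only attrs and drop duplicate tuples."""
    seen, result = set(), []
    for t in relation:
        key = tuple(t[a] for a in attrs)
        if key not in seen:  # duplicate elimination
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result


salaries = project(EMPLOYEE, ["SALARY"])         # duplicate 30000 is removed
with_key = project(EMPLOYEE, ["SSN", "SALARY"])  # key included: no tuple is lost
# pi_list1(pi_list2(R)) = pi_list1(R) when list2 contains list1
nested = project(project(EMPLOYEE, ["LNAME", "SALARY"]), ["SALARY"])

print(len(salaries), len(with_key), nested == salaries)  # 2 3 True
```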
• Print the name and number of projects offered by department 5.
Relational Algebra Expressions
• Retrieve the first name, last name, and salary of all employees who work in department number 5.
• Single relational algebra expression:
  π FNAME, LNAME, SALARY (σ DNO = 5 (EMPLOYEE))
• Using an intermediate relation:
  D5 ← σ DNO = 5 (EMPLOYEE)
  RESULT ← π FNAME, LNAME, SALARY (D5)
Example of applying multiple operations and RENAME
π FNAME, LNAME, SALARY (σ DNO = 5 (EMPLOYEE))

D5 ← σ DNO = 5 (EMPLOYEE)
R (First_name, Last_name, Salary) ← π Fname, Lname, Salary (D5)
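The two styles (one nested expression versus named intermediate relations with a rename) can be sketched in Python as follows. The helper functions and sample rows are hypothetical; only the attribute and relation names come from the slides.

```python
# Hypothetical EMPLOYEE rows for illustration.
EMPLOYEE = [
    {"FNAME": "John", "LNAME": "Smith", "SALARY": 30000, "DNO": 5},
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SALARY": 25000, "DNO": 4},
    {"FNAME": "Ramesh", "LNAME": "Narayan", "SALARY": 38000, "DNO": 5},
]


def select(relation, condition):
    return [t for t in relation if condition(t)]


def project(relation, attrs):
    result = []
    for t in relation:
        row = {a: t[a] for a in attrs}
        if row not in result:  # drop duplicates
            result.append(row)
    return result


def rename(relation, mapping):
    """rho: rename attributes according to mapping {old_name: new_name}."""
    return [{mapping.get(k, k): v for k, v in t.items()} for t in relation]


# Single nested expression
single = project(select(EMPLOYEE, lambda t: t["DNO"] == 5),
                 ["FNAME", "LNAME", "SALARY"])

# Intermediate relation D5, then projection with renamed attributes
D5 = select(EMPLOYEE, lambda t: t["DNO"] == 5)
R = rename(project(D5, ["FNAME", "LNAME", "SALARY"]),
           {"FNAME": "First_name", "LNAME": "Last_name", "SALARY": "Salary"})
```

Both forms return the same tuples; the second only gives the intermediate step and the result attributes explicit names.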
Union (Binary Operation)
• The result of R ∪ S is a relation that includes all tuples that are either in R or in S or in both R and S.
• Duplicate tuples are eliminated.
• Type compatible (union compatible):
  • The two relations R and S must be type compatible.
  • R and S must have the same number of attributes.
  • Each pair of corresponding attributes must have the same or compatible domains.


UNION Example
To retrieve the SSN of all employees who either
• work in department 5, or
• directly supervise an employee in department 5:

D5_EMPS ← σ DNO = 5 (EMPLOYEE)
RESULT1 ← π SSN (D5_EMPS)
RESULT2 (SSN) ← π SUPERSSN (D5_EMPS)
RESULT ← RESULT1 ∪ RESULT2
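The same steps can be traced in Python, assuming single-attribute relations are modeled as sets of SSN strings. The rows below are invented for illustration.

```python
# Hypothetical EMPLOYEE rows: SSN, supervisor SSN, department number.
EMPLOYEE = [
    {"SSN": "123", "SUPERSSN": "333", "DNO": 5},
    {"SSN": "333", "SUPERSSN": "888", "DNO": 5},
    {"SSN": "999", "SUPERSSN": "987", "DNO": 4},
    {"SSN": "888", "SUPERSSN": None, "DNO": 1},
]

d5_emps = [t for t in EMPLOYEE if t["DNO"] == 5]  # sigma DNO = 5
result1 = {t["SSN"] for t in d5_emps}             # pi SSN
result2 = {t["SUPERSSN"] for t in d5_emps}        # pi SUPERSSN, renamed to SSN
result = result1 | result2                        # union; "333" appears once

print(sorted(result))  # ['123', '333', '888']
```

Employee 333 both works in department 5 and supervises someone there, yet appears once in the result because union eliminates duplicates.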
INTERSECTION and SET DIFFERENCE (Binary Operations)
• INTERSECTION operation: the result of R ∩ S is a relation that includes all tuples that are in both R and S.
• SET DIFFERENCE operation: the result of R – S is a relation that includes all tuples that are in R but not in S.
• The two relations R and S must be "type compatible".
RELATIONAL ALGEBRA OPERATIONS FROM SET THEORY
• Both ∪ and ∩ are commutative operations:
  R ∪ S = S ∪ R, and R ∩ S = S ∩ R
• Both ∪ and ∩ are associative, so they can be treated as n-ary operations:
  R ∪ (S ∪ T) = (R ∪ S) ∪ T
  (R ∩ S) ∩ T = R ∩ (S ∩ T)
• The Minus operation is not commutative; in general:
  R – S ≠ S – R
Example
Two type-compatible relations, Student and Instructor, and the results of:
• Student ∪ Instructor
• Student ∩ Instructor
• Student – Instructor
• Instructor – Student
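Python's built-in sets behave like these operators on type-compatible relations. In this sketch each tuple is (first name, last name); the names are hypothetical sample data.

```python
# Hypothetical type-compatible relations: both hold (first_name, last_name).
STUDENT = {("Susan", "Yao"), ("Ramesh", "Shah"), ("Amy", "Ford")}
INSTRUCTOR = {("John", "Smith"), ("Susan", "Yao"), ("Ramesh", "Shah")}

union = STUDENT | INSTRUCTOR
intersection = STUDENT & INSTRUCTOR
only_students = STUDENT - INSTRUCTOR
only_instructors = INSTRUCTOR - STUDENT

# Union and intersection are commutative; difference is not.
print(STUDENT | INSTRUCTOR == INSTRUCTOR | STUDENT)  # True
print(only_students == only_instructors)             # False
```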


CARTESIAN PRODUCT
• The result of the Cartesian product of two relations R(A1, A2, . . ., An) x S(B1, B2, . . ., Bm) is given as:
  Result(A1, A2, . . ., An, B1, B2, . . ., Bm)
• Let |R| = nR and |S| = nS; then |R x S| = nR * nS.
• R and S do NOT have to be "type compatible".
• Cross product is a meaningful operation only if it is followed by other operations.
Problem: Retrieve a list of each female employee's dependents

F ← σ SEX = 'F' (EMPLOYEE)
EmpNames ← π FNAME, LNAME, SSN (F)
Emp_DP ← EmpNames x DEPENDENT
Actual_DP ← σ SSN = ESSN (Emp_DP)
Result ← π FNAME, LNAME, DEPENDENT_NAME (Actual_DP)
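The five steps can be followed in a short Python sketch. The `product` helper and the rows are assumptions for illustration; the product step shows why a selection has to follow it.

```python
# Hypothetical sample rows.
EMPLOYEE = [
    {"FNAME": "Jennifer", "LNAME": "Wallace", "SSN": "987", "SEX": "F"},
    {"FNAME": "John", "LNAME": "Smith", "SSN": "123", "SEX": "M"},
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SSN": "999", "SEX": "F"},
]
DEPENDENT = [
    {"ESSN": "987", "DEPENDENT_NAME": "Abner"},
    {"ESSN": "123", "DEPENDENT_NAME": "Alice"},
]


def product(r, s):
    """R x S: pair every tuple of r with every tuple of s.
    Attribute names of r and s are assumed to be distinct."""
    return [{**a, **b} for a in r for b in s]


f = [t for t in EMPLOYEE if t["SEX"] == "F"]
emp_names = [{k: t[k] for k in ("FNAME", "LNAME", "SSN")} for t in f]
emp_dp = product(emp_names, DEPENDENT)  # 2 * 2 = 4 tuples, most meaningless
actual_dp = [t for t in emp_dp if t["SSN"] == t["ESSN"]]
result = [(t["FNAME"], t["LNAME"], t["DEPENDENT_NAME"]) for t in actual_dp]

print(len(emp_dp), result)
```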
JOIN (Binary Operation)
• JOIN, denoted by ⋈, combines related tuples from two relations.
• JOIN combines CARTESIAN PRODUCT and SELECT into a single operation.
• The general form of a join operation on two relations R(A1, A2, . . ., An) and S(B1, B2, . . ., Bm) is:
  R ⋈ <join condition> S
Example: JOIN operation
Retrieve the name of the manager of each department

DEPT_MGR ← DEPARTMENT ⋈ MGRSSN = SSN EMPLOYEE
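A minimal sketch of this join in Python, written as a selection over the Cartesian product. The `theta_join` helper and the rows are hypothetical; the final list comprehension plays the role of a projection on the department and manager names.

```python
# Hypothetical sample rows.
DEPARTMENT = [
    {"DNAME": "Research", "DNUMBER": 5, "MGRSSN": "333"},
    {"DNAME": "Headquarters", "DNUMBER": 1, "MGRSSN": "888"},
]
EMPLOYEE = [
    {"SSN": "333", "FNAME": "Franklin", "LNAME": "Wong"},
    {"SSN": "888", "FNAME": "James", "LNAME": "Borg"},
    {"SSN": "123", "FNAME": "John", "LNAME": "Smith"},
]


def theta_join(r, s, condition):
    """R join_condition S, computed as sigma_condition(R x S)."""
    return [{**a, **b} for a in r for b in s if condition(a, b)]


dept_mgr = theta_join(DEPARTMENT, EMPLOYEE,
                      lambda d, e: d["MGRSSN"] == e["SSN"])
managers = [(t["DNAME"], t["FNAME"], t["LNAME"]) for t in dept_mgr]
print(managers)
```

John Smith manages no department, so he does not appear in the join result.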
RENAME OPERATION
The rename operator is denoted by ρ (rho).

D5 ← π Fname, Lname, Salary (σ DNO = 5 (EMPLOYEE))
ρ R (First_Name, Last_Name, Salary) (D5)

D5 ← σ DNO = 5 (EMPLOYEE)
R (First_name, Last_name, Salary) ← π Fname, Lname, Salary (D5)

RENAME OPERATION
• ρ S (R) renames the relation R to S.
• ρ (B1, B2, …, Bn) (R) renames the attributes to B1, B2, …, Bn.
• ρ S (B1, B2, …, Bn) (R) renames R to S and the attributes to B1, …, Bn.
Complete Set of Relational Operations
• The set of operations including
  • SELECT σ,
  • PROJECT π,
  • UNION ∪,
  • DIFFERENCE –,
  • RENAME ρ, and
  • CARTESIAN PRODUCT X
  is called a complete set, because any other relational algebra expression can be expressed using these operations.
• R ∩ S = (R ∪ S) – ((R – S) ∪ (S – R))
• R ⋈ <join condition> S = σ <join condition> (R X S)
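Both identities can be checked on small sets in Python. The relations R and S below are made-up sample data, and the join condition (equal first components) is just one example of a condition.

```python
from itertools import product

# Hypothetical type-compatible relations of (number, letter) tuples.
R = {(1, "a"), (2, "b"), (3, "c")}
S = {(2, "b"), (3, "c"), (4, "d")}

# Intersection expressed with union and difference only.
derived_intersection = (R | S) - ((R - S) | (S - R))
print(derived_intersection == R & S)  # True

# Join expressed as a selection over the Cartesian product.
join_via_product = {(r, s) for r, s in product(R, S) if r[0] == s[0]}
print(len(join_via_product))  # 2
```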
Some properties of JOIN
• Consider the following JOIN operation:
  R(A1, A2, . . ., An) ⋈ R.Ai = S.Bj S(B1, B2, . . ., Bm)
• The result is a relation Q with n + m attributes:
  Q(A1, A2, . . ., An, B1, B2, . . ., Bm), in that order.
• If R has nR tuples and S has nS tuples, then the number of tuples in the join result is at most nR * nS.
Equi-Join
• An EQUIJOIN is a join whose condition involves only the equality operator =.
• Example: Retrieve a list of each female employee's dependents
  FEmp ← σ SEX = 'F' (EMPLOYEE)
  E_DP ← FEmp ⋈ SSN = ESSN DEPENDENT
  Result ← π FNAME, LNAME, SSN, DEPENDENT_NAME (E_DP)
This is an EQUIJOIN operation
Retrieve the name of the manager of each department

DEPT_MGR ← DEPARTMENT ⋈ MGRSSN = SSN EMPLOYEE
• For each employee, print his project numbers.
• For each employee, list the names of his projects.
• For each employee, retrieve the employee name and the names of his projects.
Issue with Equijoin Operation
• You have to specify the join condition, even if two columns in the joined tables have the same name.
• Superfluous column: the result of an EQUIJOIN always has one or more pairs of attributes that have identical values in every tuple.
NATURAL JOIN Operation
• The NATURAL JOIN operation (denoted by *) requires that
  • the two join attributes, or
  • each pair of corresponding join attributes,
  have the same name in both relations.
• If this is not the case, a renaming operation is applied first.
• NATURAL JOIN also gets rid of the superfluous attribute in an EQUIJOIN result.
NATURAL JOIN Operation
• Example: Print the location of each department
  DEPT_LOCS ← DEPARTMENT * DEPT_LOCATIONS
• The only attribute with the same name is DNUMBER.
• An implicit join condition is created based on this attribute:
  DEPARTMENT.DNUMBER = DEPT_LOCATIONS.DNUMBER
Example: Natural Join
• Consider two relations R1(A, B, C, D) and R2(C, D, E).
• Natural join (general form R * S):
  RES ← R1(A, B, C, D) * R2(C, D, E)
• The implicit join condition is:
  R1.C = R2.C AND R1.D = R2.D
• The result schema keeps one copy of each join attribute: RES(A, B, C, D, E)
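A natural join on R1(A, B, C, D) and R2(C, D, E) can be sketched in Python as below. The `natural_join` helper and all attribute values are assumptions chosen for illustration.

```python
# Hypothetical tuples for R1(A, B, C, D) and R2(C, D, E).
R1 = [
    {"A": 1, "B": "x", "C": 10, "D": 100},
    {"A": 2, "B": "y", "C": 20, "D": 200},
    {"A": 3, "B": "z", "C": 10, "D": 999},
]
R2 = [
    {"C": 10, "D": 100, "E": "p"},
    {"C": 20, "D": 200, "E": "q"},
    {"C": 20, "D": 200, "E": "r"},
]


def natural_join(r, s):
    """R * S: equate all same-named attributes and keep one copy of each."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]  # here: C and D
    return [{**a, **b} for a in r for b in s
            if all(a[c] == b[c] for c in common)]


res = natural_join(R1, R2)
print(list(res[0].keys()))  # ['A', 'B', 'C', 'D', 'E']
print(len(res))             # 3
```

The third tuple of R1 matches R2 on C but not on D, so it is dropped: every common attribute must agree.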
Theta-join
• The general case of the JOIN operation is called a Theta-join: R ⋈ theta S
• Theta is a Boolean expression on the attributes of R and S; for example:
  R.Ai < S.Bj AND (R.Ak = S.Bl OR R.Ap < S.Bq)
• Theta can use any of the comparison operators {=, ≠, <, ≤, >, ≥}.
Theta-join Example
For each male employee, list the colleagues who earn more than him. Retrieve only the first name and salary.

M (Name, Sal) ← π FNAME, SALARY (σ SEX = 'M' (EMPLOYEE))
ECOL (CName, CSal) ← π FNAME, SALARY (EMPLOYEE)
R1 ← M ⋈ M.Sal < ECOL.CSal ECOL
Result R1:

Name      Sal    CName     CSal
John      30000  Franklin  40000
John      30000  Jennifer  43000
John      30000  Ramesh    38000
John      30000  James     55000
Franklin  40000  Jennifer  43000
Franklin  40000  James     55000
Ramesh    38000  Franklin  40000
Ramesh    38000  Jennifer  43000
Ramesh    38000  James     55000
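The theta-join with the condition M.Sal < ECOL.CSal can be reproduced in Python. The five EMPLOYEE rows below are an assumption, chosen so that the output is consistent with the result table; the SEX values in particular are not given in the slides.

```python
# Hypothetical EMPLOYEE rows consistent with the result table.
EMPLOYEE = [
    {"FNAME": "John", "SEX": "M", "SALARY": 30000},
    {"FNAME": "Franklin", "SEX": "M", "SALARY": 40000},
    {"FNAME": "Jennifer", "SEX": "F", "SALARY": 43000},
    {"FNAME": "Ramesh", "SEX": "M", "SALARY": 38000},
    {"FNAME": "James", "SEX": "M", "SALARY": 55000},
]

# M(Name, Sal) and ECOL(CName, CSal): projections with renamed attributes
M = [{"Name": t["FNAME"], "Sal": t["SALARY"]}
     for t in EMPLOYEE if t["SEX"] == "M"]
ECOL = [{"CName": t["FNAME"], "CSal": t["SALARY"]} for t in EMPLOYEE]

# R1 = M theta-join ECOL with theta: M.Sal < ECOL.CSal
R1 = [{**m, **c} for m in M for c in ECOL if m["Sal"] < c["CSal"]]

for t in R1:
    print(t["Name"], t["Sal"], t["CName"], t["CSal"])
```

James earns the highest salary, so no tuple of ECOL satisfies the condition for him and he is absent from R1.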
Theta-join
• For each male employee, print the names of his peers with the same salary.
  E1 ← π SSN, FNAME, SALARY (σ SEX = 'M' (EMPLOYEE))
  E2 ← ρ E2 (EMPLOYEE)
  Res ← π E1.FNAME, E2.FNAME (E1 ⋈ E1.SSN ≠ E2.SSN AND E1.SALARY = E2.SALARY E2)
Aggregate Functions
• Mathematical aggregate functions applied to collections of numeric values include SUM, AVERAGE, MAXIMUM, and MINIMUM.
• The COUNT function is used for counting tuples or values.

Examples:
• Retrieve the average or total salary of all employees.
• Retrieve the total number of employee tuples.

Aggregate Functions ℱ
• ℱ MAX Salary (EMPLOYEE)
• ℱ MIN Salary (EMPLOYEE)
• ℱ SUM Salary, AVERAGE Salary (EMPLOYEE)
• ℱ COUNT SSN (EMPLOYEE)
• COUNT (*) returns the number of rows in the result of the query (it counts without removing duplicates).
• NULL values are discarded when aggregate functions are applied to a particular column (attribute).
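A small Python sketch of these aggregates, assuming NULL is modeled as `None`. The rows are invented; note how the NULL salary is skipped by the column aggregates but still counted by COUNT (*).

```python
# Hypothetical EMPLOYEE rows; None stands for a NULL salary.
EMPLOYEE = [
    {"SSN": "1", "SALARY": 30000},
    {"SSN": "2", "SALARY": 40000},
    {"SSN": "3", "SALARY": None},
    {"SSN": "4", "SALARY": 30000},
]

# NULL values are discarded before a column aggregate is applied.
salaries = [t["SALARY"] for t in EMPLOYEE if t["SALARY"] is not None]

total = sum(salaries)
average = total / len(salaries)
maximum, minimum = max(salaries), min(salaries)
count_star = len(EMPLOYEE)    # COUNT (*): all rows, duplicates included
count_salary = len(salaries)  # COUNT Salary: NULLs are not counted

print(total, maximum, minimum, count_star, count_salary)
```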
Using Grouping with Aggregation
• Grouping can be combined with aggregate functions.
• Example: For each department, retrieve the DNO, the COUNT of employees, and the AVERAGE SALARY:
  DNO ℱ COUNT SSN, AVERAGE Salary (EMPLOYEE)
• Without grouping attributes, the functions are applied to all tuples of the relation:
  ℱ COUNT SSN, AVERAGE Salary (EMPLOYEE)
• The result attributes can be renamed:
  ρ R (Dno, No_of_employees, Average_sal) (DNO ℱ COUNT SSN, AVERAGE Salary (EMPLOYEE))
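Grouping by DNO before aggregating can be sketched in Python with a dictionary of groups. The rows are hypothetical; the output attribute names follow the renamed relation R.

```python
from collections import defaultdict

# Hypothetical EMPLOYEE rows.
EMPLOYEE = [
    {"SSN": "1", "DNO": 5, "SALARY": 30000},
    {"SSN": "2", "DNO": 5, "SALARY": 40000},
    {"SSN": "3", "DNO": 4, "SALARY": 25000},
    {"SSN": "4", "DNO": 4, "SALARY": 43000},
    {"SSN": "5", "DNO": 1, "SALARY": 55000},
]

# Partition the tuples by the grouping attribute DNO.
groups = defaultdict(list)
for t in EMPLOYEE:
    groups[t["DNO"]].append(t)

# Apply COUNT SSN and AVERAGE Salary to each group separately.
R = [
    {
        "Dno": dno,
        "No_of_employees": len(rows),
        "Average_sal": sum(r["SALARY"] for r in rows) / len(rows),
    }
    for dno, rows in sorted(groups.items())
]

for row in R:
    print(row)
```

The result has one tuple per distinct DNO value, regardless of how many employees each department has.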
Examples of Queries in RA
Retrieve the name and address of all employees who work for the 'Research' department.

RESEARCH_DEPT ← σ DNAME = 'Research' (DEPARTMENT)
RESEARCH_EMPS ← (RESEARCH_DEPT ⋈ DNUMBER = DNO EMPLOYEE)
RESULT ← π FNAME, LNAME, ADDRESS (RESEARCH_EMPS)

EXAMPLE: Retrieve the names of employees who have no dependents.

ALL_EMPS ← π SSN (EMPLOYEE)
EMPS_WITH_DEPS (SSN) ← π ESSN (DEPENDENT)
EMPS_WITHOUT_DEPS ← (ALL_EMPS – EMPS_WITH_DEPS)
RESULT ← π LNAME, FNAME (EMPS_WITHOUT_DEPS * EMPLOYEE)

EXAMPLE: Retrieve the names of all employees with two or more dependents.

T1 (Ssn, No_of_dependents) ← Essn ℱ COUNT Dependent_name (DEPENDENT)
T2 ← σ No_of_dependents > 1 (T1)
RESULT ← π LNAME, FNAME (T2 * EMPLOYEE)
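The last two queries can be traced in Python. The rows are hypothetical sample data; `Counter` stands in for the grouped COUNT, and it counts DEPENDENT tuples per ESSN, which matches COUNT Dependent_name as long as no dependent name is NULL.

```python
from collections import Counter

# Hypothetical sample rows.
EMPLOYEE = [
    {"SSN": "123", "FNAME": "John", "LNAME": "Smith"},
    {"SSN": "333", "FNAME": "Franklin", "LNAME": "Wong"},
    {"SSN": "999", "FNAME": "Alicia", "LNAME": "Zelaya"},
]
DEPENDENT = [
    {"ESSN": "123", "DEPENDENT_NAME": "Alice"},
    {"ESSN": "123", "DEPENDENT_NAME": "Michael"},
    {"ESSN": "333", "DEPENDENT_NAME": "Joy"},
]

# Employees with no dependents: set difference, then join back to EMPLOYEE.
all_emps = {e["SSN"] for e in EMPLOYEE}
emps_with_deps = {d["ESSN"] for d in DEPENDENT}
emps_without_deps = all_emps - emps_with_deps
no_deps = [(e["LNAME"], e["FNAME"]) for e in EMPLOYEE
           if e["SSN"] in emps_without_deps]

# Employees with two or more dependents: group, count, select, join back.
t1 = Counter(d["ESSN"] for d in DEPENDENT)
t2 = {ssn for ssn, n in t1.items() if n > 1}
two_or_more = [(e["LNAME"], e["FNAME"]) for e in EMPLOYEE if e["SSN"] in t2]

print(no_deps, two_or_more)
```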
Outer Join Operation
• In an INNER JOIN, tuples without a matching tuple are eliminated from the join result.
• Tuples with null values in the join attributes are also eliminated.
• This amounts to a loss of information.
• OUTER JOIN operations are used when we want to keep
  • all the tuples in R in the join result, or
  • all the tuples in S in the join result, or
  • all tuples in both relations R and S in the join result.
Left Outer Join
• List the employees' names and the name of the department that they manage. If they don't manage one, then indicate this with a null value.
  Temp ← (Employee ⟕ Ssn = Mgr_Ssn Department)
  Result ← π Fname, Minit, Lname, Dname (Temp)
Right Outer Join
• List the employees' names and the name of the department that they manage. If they don't manage one, then indicate this with a null value.
  Temp ← (Department ⟖ Mgr_Ssn = Ssn Employee)
  Result ← π Fname, Minit, Lname, Dname (Temp)
Full Outer Join
List the employees' names and the name of the department that they manage. If they don't manage one, or the department has no manager, then indicate this with a null value.

Temp ← Employee ⟗ Ssn = Mgr_Ssn Department
Result ← π Fname, Lname, Dname (Temp)

Full Outer Join vs Cartesian Product
What is the difference? Or are they the same?
Outer Join Operation
• Left outer join: keeps every tuple in R; denoted as R ⟕ S.
  • If no matching tuple is found in S, then the attributes of S in the join result are filled with null values.
• Right outer join: keeps every tuple in S in the result of R ⟖ S.
• Full outer join: keeps all tuples in both the left and the right relations. It is denoted by R ⟗ S.
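The three outer joins can be sketched in Python with `None` standing for null. The helper functions and the rows (including a department without a manager) are assumptions for illustration.

```python
# Hypothetical sample rows; Planning has no manager.
EMPLOYEE = [
    {"Ssn": "333", "Lname": "Wong"},
    {"Ssn": "123", "Lname": "Smith"},
]
DEPARTMENT = [
    {"Dname": "Research", "Mgr_ssn": "333"},
    {"Dname": "Planning", "Mgr_ssn": None},
]


def left_outer_join(r, s, condition, s_attrs):
    """Keep every tuple of r; pad the attributes of s with None if unmatched."""
    result = []
    for a in r:
        matches = [{**a, **b} for b in s if condition(a, b)]
        result.extend(matches or [{**a, **dict.fromkeys(s_attrs)}])
    return result


def full_outer_join(r, s, condition, r_attrs, s_attrs):
    """Left outer join plus the unmatched tuples of s, padded with None."""
    result = left_outer_join(r, s, condition, s_attrs)
    for b in s:
        if not any(condition(a, b) for a in r):
            result.append({**dict.fromkeys(r_attrs), **b})
    return result


def manages(e, d):
    return e["Ssn"] == d["Mgr_ssn"]


left = left_outer_join(EMPLOYEE, DEPARTMENT, manages, ["Dname", "Mgr_ssn"])
# A right outer join is a left outer join with the operands swapped.
right = left_outer_join(DEPARTMENT, EMPLOYEE,
                        lambda d, e: manages(e, d), ["Ssn", "Lname"])
full = full_outer_join(EMPLOYEE, DEPARTMENT, manages,
                       ["Ssn", "Lname"], ["Dname", "Mgr_ssn"])
```

With this data the full outer join has 3 tuples, while the Cartesian product would have 2 * 2 = 4, most of them pairing unrelated tuples.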
Inner and Outer Joins

Another Example: Outer Join

List the employees' names and the names of the projects that they work on. If an employee does not work on any project, or a project has no employee working on it, then indicate this with a null value.
Yet another Example
Find the SSN of employees who work on all the projects of Dnum = 4

• PD4 (Pno) ← π Pnumber (σ DNUM = 4 (Project))
• Ssn_Pnos ← π Essn, Pno (Works_on)
• SSNS (Ssn) ← Ssn_Pnos ??? PD4

The missing operator is DIVISION. With PD4 containing the project numbers 10 and 30:

PD4
Pno
10
30

• SSNS (Ssn) ← Ssn_Pnos ÷ PD4
DIVISION (Binary Operation)
• The division operation is applied to two relations R1 and R2:
  R1(Attr_R1) ÷ R2(Attr_R2), where Attr_R2 ⊆ Attr_R1.
• Let Result = R1 ÷ R2.
  • Attr_Result = Attr_R1 – Attr_R2
  • Attr_Result is the set of attributes of R1 that are not attributes of R2.
• For a tuple t to appear in the result of the DIVISION, the values in t must appear in R1 in combination with every tuple in R2.
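A sketch of division in Python for the two-attribute case R(ssn, pno) ÷ S(pno). The WORKS_ON pairs are made up; PD4 uses the project numbers 10 and 30. The second function computes the same result using only projection, Cartesian product, and difference, as a cross-check.

```python
# Hypothetical (Essn, Pno) pairs and the divisor relation.
WORKS_ON = {("111", 10), ("111", 30), ("222", 10),
            ("333", 10), ("333", 30), ("333", 20)}
PD4 = {10, 30}


def divide(r, s):
    """R(ssn, pno) / S(pno): ssn values paired in r with every pno in s."""
    candidates = {ssn for ssn, _ in r}
    return {ssn for ssn in candidates if all((ssn, pno) in r for pno in s)}


def divide_via_basic_ops(r, s):
    """Same result from projection, Cartesian product, and difference."""
    candidates = {ssn for ssn, _ in r}                              # pi ssn (R)
    missing = {(ssn, pno) for ssn in candidates for pno in s} - r  # (pi ssn (R) x S) - R
    return candidates - {ssn for ssn, _ in missing}                 # drop ssns with a gap


ssns = divide(WORKS_ON, PD4)
print(sorted(ssns))  # ['111', '333']
```

Employee 222 works on project 10 only, so the pair (222, 30) is missing from WORKS_ON and 222 is excluded.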
Example of DIVISION
Find the SSN of employees who work on all the projects that John Smith works on

• Smith ← σ Fname = 'John' AND Lname = 'Smith' (Employee)
• Smith_Pnos ← π Pno (Works_on ⋈ Essn = Ssn Smith)
• Ssn_Pnos ← π Essn, Pno (Works_on)
• SSNS (Ssn) ← Ssn_Pnos ÷ Smith_Pnos
Simulation of Division operator
R ÷ S, where B denotes the attributes of R that are not in S:

• Temp ← π B ((π B (R) x S) – R)
• T ← π B (R) – Temp
Example of Query Tree
For every project located in 'Stafford', list the project number, the controlling department number, and the department manager's last name, address, and birth date.

• Internal nodes stand for operations such as selection, projection, join, and division.
• Leaf nodes represent base relations.
Query Tree
• An internal data structure to represent a query.
• A standard technique for estimating the work done in executing the query and for optimizing its execution.
• A tree gives a good visual feel of the complexity of the query and the operations involved.
• Algebraic query optimization consists of rewriting the query or modifying the query tree into an equivalent tree.
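A query tree can be sketched as a small recursive data structure. The `Node` class, the `evaluate` function, and the sample rows are assumptions for illustration; the tree shape used for the 'Stafford' query (selection first, then two joins, then projection) is one plausible form, and the address and birth date attributes are left out for brevity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    op: str                 # "relation", "select", "project", or "join"
    children: List["Node"]  # empty for leaf nodes
    arg: object = None      # rows, predicate, or attribute list


def evaluate(node):
    if node.op == "relation":  # leaf node: a base relation
        return node.arg
    inputs = [evaluate(c) for c in node.children]
    if node.op == "select":
        return [t for t in inputs[0] if node.arg(t)]
    if node.op == "project":
        out = []
        for t in inputs[0]:
            row = {a: t[a] for a in node.arg}
            if row not in out:
                out.append(row)
        return out
    if node.op == "join":
        return [{**a, **b} for a in inputs[0] for b in inputs[1]
                if node.arg(a, b)]
    raise ValueError(f"unknown operation: {node.op}")


# Hypothetical base relations.
PROJECT = [{"Pnumber": 10, "Plocation": "Stafford", "Dnum": 4},
           {"Pnumber": 1, "Plocation": "Bellaire", "Dnum": 5}]
DEPARTMENT = [{"Dnumber": 4, "Mgr_ssn": "987"},
              {"Dnumber": 5, "Mgr_ssn": "333"}]
EMPLOYEE = [{"Ssn": "987", "Lname": "Wallace"},
            {"Ssn": "333", "Lname": "Wong"}]

tree = Node("project", [
    Node("join", [
        Node("join", [
            Node("select", [Node("relation", [], PROJECT)],
                 lambda t: t["Plocation"] == "Stafford"),
            Node("relation", [], DEPARTMENT),
        ], lambda p, d: p["Dnum"] == d["Dnumber"]),
        Node("relation", [], EMPLOYEE),
    ], lambda pd, e: pd["Mgr_ssn"] == e["Ssn"]),
], ["Pnumber", "Dnum", "Lname"])

result = evaluate(tree)
print(result)
```

Rewriting the tree, for example moving the selection above the joins, yields an equivalent tree that produces the same result with a different amount of intermediate work.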
Recursive Closure Operation
• This cannot be specified in general using Relational Algebra.
• Example: Retrieve all SUPERVISEES of an EMPLOYEE e at all levels, that is,
  • all employees e′ directly supervised by e;
  • all employees e″ directly supervised by each employee e′;
  • all employees e‴ directly supervised by each employee e″;
  • and so on.
• We can retrieve the employees at each level and then take their union; however, we cannot specify a query such as "retrieve the supervisees of 'James Borg' at all levels" without utilizing a looping mechanism.
• The SQL3 standard includes syntax for recursive closure.
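The looping mechanism can be sketched in Python: each iteration performs one level of "directly supervised by" and the results are unioned until nothing new appears. The SUPERVISION pairs and the function name are hypothetical.

```python
# Hypothetical pairs: (employee SSN, direct supervisor SSN).
SUPERVISION = {("333", "888"), ("987", "888"),
               ("123", "333"), ("666", "333"), ("453", "123")}


def supervisees_at_all_levels(boss, supervision):
    found = set()
    frontier = {boss}
    while frontier:
        # One level: employees directly supervised by anyone in the frontier.
        next_level = {emp for emp, sup in supervision if sup in frontier} - found
        found |= next_level    # union of the levels retrieved so far
        frontier = next_level  # stop when a level adds nothing new
    return found


all_levels = supervisees_at_all_levels("888", SUPERVISION)
print(sorted(all_levels))  # ['123', '333', '453', '666', '987']
```

The number of iterations depends on the depth of the supervision hierarchy in the data, which is why a fixed relational algebra expression cannot express the query in general.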
PRACTICE QUESTION
• Do the example queries and the questions at the end of the Relational Algebra chapter in:
  • Fundamentals of Database Systems (6th Edition), Ramez Elmasri
  • Database Systems: The Complete Book, Hector Garcia-Molina, Jeffrey Ullman, Jennifer Widom
  • Database Management Systems, Raghu Ramakrishnan
Relational Algebra Operators
• Relational Algebra consists of several groups of operations:
  • Unary Relational Operations
    • SELECT (symbol: σ (sigma))
    • PROJECT (symbol: π (pi))
    • RENAME (symbol: ρ (rho))
  • Relational Algebra Operations From Set Theory
    • UNION (∪), INTERSECTION (∩), DIFFERENCE (–)
    • CARTESIAN PRODUCT (x)
  • Binary Relational Operations
    • JOIN (several variations of JOIN exist)
    • DIVISION
  • Additional Relational Operations
    • OUTER JOINS, OUTER UNION
    • AGGREGATE FUNCTIONS (these compute summaries of information: for example, SUM, COUNT, AVG, MIN, MAX)