
Module2 DBMS (Part2)

Module 2 (Part 2) covers relational algebra operations including unary operations like SELECT and PROJECT, and binary operations such as JOIN and UNION. It explains how to map conceptual designs into logical designs using ER-to-relational mapping, detailing the creation of relations from entity types and relationships. Additional operations like OUTER JOIN and aggregate functions are also discussed, alongside examples of queries in relational algebra.

Uploaded by

manojpokemon0

Module 2 (Part 2)

Relational Algebra: Unary and Binary relational operations, additional relational operations
(aggregate, grouping, etc.), Examples of Queries in relational algebra.

Mapping Conceptual Design into a Logical Design: Relational Database Design using ER-to-
Relational mapping

Relational Algebra:

Unary Relational Operations: SELECT and PROJECT

SELECT

The SELECT operation, denoted by σ (sigma), is used to select a subset of the tuples from a
relation based on a selection condition. The selection condition acts as a filter that keeps only
those tuples that satisfy a qualifying condition. Alternatively, we can consider the SELECT
operation to restrict the tuples in a relation to only those tuples that satisfy the condition.

The SELECT operation can also be visualized as a horizontal partition of the relation into two
sets of tuples: those tuples that satisfy the condition and are selected, and those tuples that do
not satisfy the condition and are discarded.

In general, the SELECT operation is denoted by

σ <selection condition> (R)

where,

- the symbol σ (sigma) is used to denote the SELECT operator

- the selection condition is a Boolean (conditional) expression specified on the attributes of
relation R

- tuples that make the condition true are selected and appear in the result of the operation

- tuples that make the condition false are filtered out and discarded from the result of the
operation

The Boolean expression specified in <selection condition> is made up of a number of clauses of
the form:

<attribute name> <comparison op> <constant value>

or

<attribute name> <comparison op> <attribute name>

where

- <attribute name> is the name of an attribute of R,

- <comparison op> is one of the operators {=, <, ≤, >, ≥, ≠}, and

- <constant value> is a constant value from the attribute domain.

Clauses can be connected by the standard Boolean operators and, or, and not to form a general
selection condition.

Examples:

1. Select the EMPLOYEE tuples whose department number is 4:

σ Dno=4 (EMPLOYEE)

2. Select the EMPLOYEE tuples whose salary is greater than $30,000:

σ Salary>30000 (EMPLOYEE)

The result of a SELECT operation can be determined as follows:

The <selection condition> is applied independently to each individual tuple t in R. If the
condition evaluates to TRUE, then tuple t is selected. All the selected tuples appear in the
result of the SELECT operation.

The Boolean conditions AND, OR, and NOT have their normal interpretation, as follows:

- (cond1 AND cond2) is TRUE if both (cond1) and (cond2) are TRUE; otherwise, it is FALSE.

- (cond1 OR cond2) is TRUE if either (cond1) or (cond2) or both are TRUE; otherwise, it is
FALSE.

- (NOT cond) is TRUE if cond is FALSE; otherwise, it is FALSE

The SELECT operator is unary; that is, it is applied to a single relation. The degree of the
relation resulting from a SELECT operation is the same as the degree of R
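
The filtering behaviour of SELECT can be sketched in Python. This is only an illustration: the
relation is modelled as a list of dictionaries, the helper name select and the sample tuples are
hypothetical, and the attribute names Ssn, Dno, and Salary are assumed from the COMPANY example.

```python
# Hypothetical EMPLOYEE tuples; each dict is one tuple of the relation.
EMPLOYEE = [
    {"Ssn": "111", "Dno": 4, "Salary": 25000},
    {"Ssn": "222", "Dno": 5, "Salary": 40000},
    {"Ssn": "333", "Dno": 4, "Salary": 43000},
]

def select(relation, condition):
    """Apply the condition to each tuple independently and keep those that satisfy it."""
    return [t for t in relation if condition(t)]

# Department number is 4.
dept4 = select(EMPLOYEE, lambda t: t["Dno"] == 4)
# Salary is greater than 30000.
high_paid = select(EMPLOYEE, lambda t: t["Salary"] > 30000)
# Two clauses connected by AND.
both = select(EMPLOYEE, lambda t: t["Dno"] == 4 and t["Salary"] > 30000)
```

Every tuple in the result keeps all of its attributes, which is why the degree of the result
equals the degree of the original relation.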

The PROJECT Operation

The PROJECT operation, denoted by π (pi), selects certain columns from the table and discards
the other columns. It is used when we are interested in only certain attributes of a relation.
The result of the PROJECT operation can be visualized as a vertical partition of the relation into
two relations:

- one has the needed columns (attributes) and contains the result of the operation

- the other contains the discarded columns

The general form of the PROJECT operation is

π <attribute list> (R)

where

π (pi) - symbol used to represent the PROJECT operation,

<attribute list> - desired sublist of attributes from the attributes of relation R.

The result of the PROJECT operation has only the attributes specified in <attribute list> in the
same order as they appear in the list. Hence, its degree is equal to the number of attributes in
<attribute list>.

Example:

1. To list each employee's first name, last name, and salary, we can use the PROJECT operation
as follows:

π Lname, Fname, Salary (EMPLOYEE)

If the attribute list does not include a key of R, duplicate tuples are likely to occur. The
PROJECT operation removes any duplicate tuples, so its result is a set of distinct tuples, and
hence a valid relation. This is known as duplicate elimination.
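
A minimal Python sketch of PROJECT with duplicate elimination follows. The helper name project
and the sample tuples are hypothetical; the relation is again modelled as a list of dictionaries.

```python
# Hypothetical EMPLOYEE tuples; the first two differ only in Dno.
EMPLOYEE = [
    {"Fname": "Asha", "Lname": "Rao", "Salary": 30000, "Dno": 4},
    {"Fname": "Asha", "Lname": "Rao", "Salary": 30000, "Dno": 5},
    {"Fname": "Ravi", "Lname": "Kumar", "Salary": 40000, "Dno": 5},
]

def project(relation, attributes):
    """Keep only the listed attributes, in list order, and drop duplicate tuples."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:  # duplicate elimination
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result

names_and_salaries = project(EMPLOYEE, ["Lname", "Fname", "Salary"])
```

Because Dno is discarded, the first two tuples become identical and only one of them survives in
the result.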

Sequences of Operations and the RENAME Operation

For most queries, we need to apply several relational algebra operations one after the other.
Either we can write the operations as a single relational algebra expression by nesting the
operations, or we can apply one operation at a time and create intermediate result relations. In the
latter case, we must give names to the relations that hold the intermediate results.

For example, to retrieve the first name, last name, and salary of all employees who work in
department number 5, we must apply a SELECT and a PROJECT operation. We can write a
single relational algebra expression, also known as an in-line expression, as follows:

π Fname, Lname, Salary (σ Dno=5 (EMPLOYEE))

Alternatively, we can explicitly show the sequence of operations, giving a name to each
intermediate relation, as follows:

DEP5_EMPS ← σ Dno=5 (EMPLOYEE)
RESULT ← π Fname, Lname, Salary (DEP5_EMPS)

We can also use this technique to rename the attributes in the intermediate and result relations.
To rename the attributes in a relation, we simply list the new attribute names in parentheses, as
in the following example:

TEMP ← σ Dno=5 (EMPLOYEE)
R(First_name, Last_name, Salary) ← π Fname, Lname, Salary (TEMP)

Relational Algebra Operations from Set Theory

The UNION, INTERSECTION, and MINUS Operations

UNION: The result of this operation, denoted by R ∪ S, is a relation that includes all tuples
that are either in R or in S or in both R and S. Duplicate tuples are eliminated.

INTERSECTION: The result of this operation, denoted by R ∩ S, is a relation that includes all
tuples that are in both R and S.

SET DIFFERENCE (or MINUS): The result of this operation, denoted by R − S, is a relation
that includes all tuples that are in R but not in S.

Example: Consider the following two relations: STUDENT and INSTRUCTOR.
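
The three set operations can be sketched with Python sets. The tuples below are hypothetical
(first name, last name) pairs, not the contents of the original figure; both relations are
assumed to be union compatible, that is, to have the same attributes in the same order.

```python
# Hypothetical union-compatible relations: sets of (Fname, Lname) tuples.
STUDENT = {("Susan", "Yao"), ("Ramesh", "Shah"), ("Amy", "Ford")}
INSTRUCTOR = {("Susan", "Yao"), ("John", "Smith")}

union = STUDENT | INSTRUCTOR          # in STUDENT, in INSTRUCTOR, or in both
intersection = STUDENT & INSTRUCTOR   # in both relations
difference = STUDENT - INSTRUCTOR     # in STUDENT but not in INSTRUCTOR
```

Using a set type makes duplicate elimination in the union automatic: the tuple that appears in
both relations is stored once.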

The CARTESIAN PRODUCT (CROSS PRODUCT) Operation

The CARTESIAN PRODUCT operation, also known as CROSS PRODUCT or CROSS JOIN and
denoted by ×, is a binary set operation, but the relations on which it is applied do not have to be
union compatible. This set operation produces a new element by combining every member
(tuple) from one relation (set) with every member (tuple) from the other relation (set). If R has
nR tuples and S has nS tuples, then R × S has nR * nS tuples.

Binary Relational Operations: JOIN and DIVISION

The JOIN Operation

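The JOIN operation, denoted by ⋈, is used to combine related tuples from two relations into
single longer tuples. The general form of a JOIN operation on two relations R(A1, A2, ..., An)
and S(B1, B2, ..., Bm) is

R ⋈ <join condition> S

The result of the JOIN is a relation Q with n + m attributes Q(A1, A2, ..., An, B1, B2, ..., Bm)
in that order. Q has one tuple for each combination of tuples, one from R and one from S,
whenever the combination satisfies the join condition. This is the main difference between
CARTESIAN PRODUCT and JOIN. In JOIN, only combinations of tuples satisfying the join
condition appear in the result, whereas in the CARTESIAN PRODUCT all combinations of
tuples are included in the result. The join condition is specified on attributes from the two
relations R and S and is evaluated for each combination of tuples.

Each tuple combination for which the join condition evaluates to TRUE is included in the
resulting relation Q as a single combined tuple. A general join condition is of the form

<condition> AND <condition> AND ... AND <condition>

where each <condition> is of the form Ai θ Bj, Ai is an attribute of R, Bj is an attribute of S,
Ai and Bj have the same domain, and θ (theta) is one of the comparison operators
{=, <, ≤, >, ≥, ≠}. A JOIN operation with such a general join condition is called a THETA JOIN.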
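
The difference between CARTESIAN PRODUCT and JOIN can be sketched in Python. The helper
names and sample tuples are hypothetical, and the sketch assumes the two relations have no
attribute names in common, so that two tuples can be merged into one dictionary.

```python
from itertools import product

# Hypothetical relations with disjoint attribute names.
DEPARTMENT = [
    {"Dname": "Research", "Dnumber": 5, "Mgr_ssn": "333"},
    {"Dname": "Admin", "Dnumber": 4, "Mgr_ssn": "987"},
]
EMPLOYEE = [
    {"Ssn": "333", "Lname": "Rao"},
    {"Ssn": "987", "Lname": "Kumar"},
    {"Ssn": "123", "Lname": "Iyer"},
]

def cartesian_product(r, s):
    """Combine every tuple of r with every tuple of s."""
    return [{**t1, **t2} for t1, t2 in product(r, s)]

def theta_join(r, s, condition):
    """Keep only the combinations for which the join condition is True."""
    return [{**t1, **t2} for t1, t2 in product(r, s) if condition(t1, t2)]

all_pairs = cartesian_product(DEPARTMENT, EMPLOYEE)
dept_mgr = theta_join(DEPARTMENT, EMPLOYEE, lambda d, e: d["Mgr_ssn"] == e["Ssn"])
```

The product contains 2 * 3 = 6 combined tuples, while the join keeps only the two combinations
in which Mgr_ssn equals Ssn.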


Variations of JOIN: The EQUIJOIN and NATURAL JOIN

The most common use of JOIN involves join conditions with equality comparisons only. Such a
JOIN, where the only comparison operator used is =, is called an EQUIJOIN. In the result of an
EQUIJOIN we always have one or more pairs of attributes that have identical values in every
tuple.

For example, consider the EQUIJOIN

DEPT_MGR ← DEPARTMENT ⋈ Mgr_ssn=Ssn EMPLOYEE

The values of the attributes Mgr_ssn and Ssn are identical in every tuple of DEPT_MGR (the
EQUIJOIN result) because the equality join condition specified on these two attributes requires
the values to be identical in every tuple in the result.

Because one of each pair of attributes with identical values is superfluous, a new operation
called NATURAL JOIN, denoted by *, was created to get rid of the second (superfluous)
attribute in an EQUIJOIN condition. The standard definition of NATURAL JOIN requires that
the two join attributes (or each pair of join attributes) have the same name in both relations. If
this is not the case, a renaming operation is applied first. Suppose we want to combine each
PROJECT tuple with the DEPARTMENT tuple that controls the project. First we rename the
Dnumber attribute of DEPARTMENT to Dnum so that it has the same name as the Dnum
attribute in PROJECT, and then we apply NATURAL JOIN (ρ, rho, denotes the RENAME
operation):

PROJ_DEPT ← PROJECT * ρ (Dname, Dnum, Mgr_ssn, Mgr_start_date) (DEPARTMENT)

The attribute Dnum is called the join attribute for the NATURAL JOIN operation, because it is
the only attribute with the same name in both relations.
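
A Complete Set of Relational Algebra Operations

The set of relational algebra operations {σ, π, ∪, ρ, −, ×} is a complete set; that is, any of the
other relational algebra operations can be expressed as a sequence of operations from this set.
For example, a JOIN operation can be specified as a CARTESIAN PRODUCT followed by a
SELECT operation. Similarly, a NATURAL JOIN can be specified as a CARTESIAN PRODUCT
preceded by RENAME and followed by SELECT and PROJECT operations. Hence, the various
JOIN operations are also not strictly necessary for the expressive power of the relational algebra.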

The DIVISION Operation

The DIVISION operation, denoted by ÷, is useful for a special kind of query that sometimes
occurs in database applications.
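
In the standard textbook definition, DIVISION is applied to two relations R(Z) ÷ S(X), where the
attributes X of S are a subset of the attributes Z of R. The result has the attributes Y = Z − X
and contains each Y-value that appears in R in combination with every tuple of S. A typical use
is a query such as "retrieve the employees who work on all the projects in a given list". The
Python sketch below illustrates this under those assumptions; the helper name divide, the
relation TARGET_PROJECTS, and the sample tuples are hypothetical.

```python
def divide(r, s, y_attrs, x_attrs):
    """R(Y, X) ÷ S(X): the Y-values that are paired in R with every X-value of S."""
    s_values = {tuple(t[a] for a in x_attrs) for t in s}
    paired = {}  # maps each Y-value to the set of X-values it appears with in R
    for t in r:
        y = tuple(t[a] for a in y_attrs)
        paired.setdefault(y, set()).add(tuple(t[a] for a in x_attrs))
    return {y for y, xs in paired.items() if s_values <= xs}

# Hypothetical data: which employee (Essn) works on which project (Pno).
WORKS_ON = [
    {"Essn": "111", "Pno": 1}, {"Essn": "111", "Pno": 2},
    {"Essn": "222", "Pno": 1},
    {"Essn": "333", "Pno": 1}, {"Essn": "333", "Pno": 2}, {"Essn": "333", "Pno": 3},
]
TARGET_PROJECTS = [{"Pno": 1}, {"Pno": 2}]

# Employees who work on all of the target projects.
result = divide(WORKS_ON, TARGET_PROJECTS, ["Essn"], ["Pno"])
```

Employee 222 is excluded because it is paired with project 1 only, while employee 333 qualifies
even though it also works on a project outside the target list.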

Additional Relational Operations

Generalized Projection

The generalized projection operation extends the projection operation by allowing functions of
attributes to be included in the projection list. The generalized form can be expressed as:

π F1, F2, ..., Fn (R)

where F1, F2, ..., Fn are functions over the attributes in relation R and may involve arithmetic
operations and constant values. The generalized projection is helpful when developing reports
where computed values have to be produced in the columns of a query result. For example,
consider the relation EMPLOYEE (Ssn, Salary, Deduction, Years_service). A report may be
required to show

Net Salary = Salary − Deduction,

Bonus = 2000 * Years_service, and

Tax = 0.25 * Salary.

Generalized projection combined with renaming:

REPORT ← ρ (Ssn, Net_salary, Bonus, Tax) (π Ssn, Salary − Deduction, 2000 * Years_service, 0.25 * Salary (EMPLOYEE))
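
The report described above can be sketched in Python as a projection whose list contains
functions of the attributes. The helper name generalized_project and the sample tuples are
hypothetical; the output attribute names Net_salary, Bonus, and Tax are assumed for the renamed
columns. For brevity the sketch does not perform duplicate elimination.

```python
# Hypothetical EMPLOYEE tuples with the attributes named in the text.
EMPLOYEE = [
    {"Ssn": "111", "Salary": 40000, "Deduction": 5000, "Years_service": 3},
    {"Ssn": "222", "Salary": 60000, "Deduction": 8000, "Years_service": 10},
]

def generalized_project(relation, expressions):
    """Compute each named expression for every tuple of the relation."""
    return [{name: f(t) for name, f in expressions.items()} for t in relation]

REPORT = generalized_project(EMPLOYEE, {
    "Ssn": lambda t: t["Ssn"],
    "Net_salary": lambda t: t["Salary"] - t["Deduction"],
    "Bonus": lambda t: 2000 * t["Years_service"],
    "Tax": lambda t: 0.25 * t["Salary"],
})
```

The keys of the expressions dictionary play the role of the renamed attributes in the result.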

Aggregate Functions and Grouping

Aggregate functions are used in simple statistical queries that summarize information from the
database tuples. Common functions applied to collections of numeric values include SUM,
AVERAGE, MAXIMUM, and MINIMUM. The COUNT function is used for counting tuples or
values. Examples include retrieving the average or total salary of all employees, or the total
number of employee tuples.

Another common type of request involves grouping the tuples in a relation by the value of some
of their attributes and then applying an aggregate function independently to each group. For
example, we can group EMPLOYEE tuples by Dno, so that each group includes the tuples for
employees working in the same department. We can then list each Dno value along with, say,
the average salary of employees within the department, or the number of employees who work
in the department.

The aggregate function operation is denoted by the symbol ℑ (script F) and has the general form

<grouping attributes> ℑ <function list> (R)

where,

<grouping attributes> : list of attributes of the relation specified in R

<function list> : list of (<function> <attribute>) pairs

<function> : one of the allowed functions, such as SUM, AVERAGE, MAXIMUM, MINIMUM, COUNT

<attribute> : an attribute of the relation specified by R

The resulting relation has the grouping attributes plus one attribute for each element in the
function list.

Example: To retrieve each department number, the number of employees in the department, and
their average salary, while renaming the resulting attributes:

ρ R(Dno, No_of_employees, Average_sal) (Dno ℑ COUNT Ssn, AVERAGE Salary (EMPLOYEE))
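
Grouping followed by aggregation can be sketched in Python as follows. The helper name
group_count_avg and the sample tuples are hypothetical; the result attribute names
No_of_employees and Average_sal are assumed for the renamed columns.

```python
from collections import defaultdict

# Hypothetical EMPLOYEE tuples.
EMPLOYEE = [
    {"Ssn": "111", "Dno": 5, "Salary": 30000},
    {"Ssn": "222", "Dno": 5, "Salary": 40000},
    {"Ssn": "333", "Dno": 4, "Salary": 25000},
]

def group_count_avg(relation, group_attr, value_attr):
    """Group tuples by group_attr, then apply COUNT and AVERAGE to each group."""
    groups = defaultdict(list)
    for t in relation:
        groups[t[group_attr]].append(t[value_attr])
    return [
        {
            group_attr: key,
            "No_of_employees": len(values),
            "Average_sal": sum(values) / len(values),
        }
        for key, values in sorted(groups.items())
    ]

R = group_count_avg(EMPLOYEE, "Dno", "Salary")
```

The result has one tuple per distinct Dno value: the grouping attribute plus one attribute for
each function applied.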

Recursive Closure Operations

A recursive closure operation is applied to a recursive relationship between tuples of the same
type, such as the relationship between an employee and a supervisor.

Example: to retrieve the Ssn of all employees directly supervised by 'Elsa David' (at level 1):

[Figures not recovered: steps (a), (b), and (c) of the query, each followed by its output relation.]
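
The idea of retrieving supervisees level by level can be sketched in Python. This is not the
expression from the original figures: the SUPERVISION pairs, the Ssn values, and both helper
names are hypothetical. Each pass of the loop corresponds to joining the SUPERVISION relation
with the result of the previous level.

```python
# Hypothetical (Ssn, Super_ssn) pairs: 111 supervises 222 and 333, and so on.
SUPERVISION = [("222", "111"), ("333", "111"), ("444", "222"), ("555", "444")]

def supervised_at_level(supervision, boss_ssn, level):
    """Ssn of the employees supervised by boss_ssn at exactly the given level."""
    current = {boss_ssn}
    for _ in range(level):
        current = {ssn for ssn, super_ssn in supervision if super_ssn in current}
    return current

def all_supervised(supervision, boss_ssn):
    """Ssn of the employees supervised by boss_ssn at any level (the closure)."""
    result, frontier = set(), {boss_ssn}
    while frontier:
        frontier = {s for s, sup in supervision if sup in frontier} - result
        result |= frontier
    return result

level1 = supervised_at_level(SUPERVISION, "111", 1)
level2 = supervised_at_level(SUPERVISION, "111", 2)
everyone = all_supervised(SUPERVISION, "111")
```

A fixed number of levels needs only a fixed number of joins, whereas the full closure needs a
loop whose length depends on the data.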

OUTER JOIN Operations

A set of operations, called outer joins, were developed for the case where the user wants to keep
all the tuples in R, or all those in S, or all those in both relations in the result of the JOIN,
regardless of whether or not they have matching tuples in the other relation. For example,
suppose that we want a list of all employee names as well as the name of the departments they
manage if they happen to manage a department; if they do not manage one, we can indicate it
with a NULL value. We can apply an operation LEFT OUTER JOIN, denoted by ⟕, to
retrieve the result as follows:

TEMP ← (EMPLOYEE ⟕ Ssn=Mgr_ssn DEPARTMENT)
RESULT ← π Fname, Minit, Lname, Dname (TEMP)

The LEFT OUTER JOIN operation keeps every tuple in the first, or left, relation R; if no
matching tuple is found in S, then the attributes of S in the join result are filled or padded with
NULL values.

A similar operation, RIGHT OUTER JOIN, denoted by ⟖, keeps every tuple in the second, or
right, relation S.

A third operation, FULL OUTER JOIN, denoted by ⟗, keeps all tuples in both the left and the
right relations when no matching tuples are found, padding them with NULL values as needed.
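
A LEFT OUTER JOIN can be sketched in Python as follows, with None standing in for NULL. The
helper name left_outer_join and the sample tuples are hypothetical, and the sketch assumes the
two relations have no attribute names in common.

```python
# Hypothetical relations: only employee 111 manages a department.
EMPLOYEE = [
    {"Ssn": "111", "Lname": "Rao"},
    {"Ssn": "222", "Lname": "Kumar"},
]
DEPARTMENT = [{"Dname": "Research", "Mgr_ssn": "111"}]

def left_outer_join(r, s, condition, s_attrs):
    """Keep every tuple of r; pad the attributes of s with None when nothing matches."""
    result = []
    for t1 in r:
        matches = [t2 for t2 in s if condition(t1, t2)]
        if matches:
            result.extend({**t1, **t2} for t2 in matches)
        else:
            result.append({**t1, **{a: None for a in s_attrs}})
    return result

TEMP = left_outer_join(
    EMPLOYEE, DEPARTMENT,
    lambda e, d: e["Ssn"] == d["Mgr_ssn"],
    ["Dname", "Mgr_ssn"],
)
RESULT = [(t["Lname"], t["Dname"]) for t in TEMP]
```

A RIGHT OUTER JOIN can be obtained from the same helper by swapping the two relations.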

The OUTER UNION Operation

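The OUTER UNION operation was developed to take the union of tuples from two relations that
have some common attributes, but are not union (type) compatible. It takes the union of tuples
in two relations R(X, Y) and S(X, Z) that are partially compatible, meaning that only some of
their attributes, say X, are union compatible. The attributes that are union compatible are
represented only once in the result, and the attributes that are not union compatible from either
relation are also kept in the result relation T(X, Y, Z).

Two tuples t1 in R and t2 in S are said to match if t1[X] = t2[X]. These will be combined
(unioned) into a single tuple in T. Tuples in either relation that have no matching tuple in the
other relation are padded with NULL values.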

For example, an OUTER UNION can be applied to two relations whose schemas are:

STUDENT(Name, Ssn, Department, Advisor)

INSTRUCTOR(Name, Ssn, Department, Rank)

Tuples from the two relations are matched based on having the same combination of values of
the shared attributes Name, Ssn, Department. All the tuples from both relations are included in
the result, but tuples with the same (Name, Ssn, Department) combination will appear only once
in the result. Tuples appearing only in STUDENT will have a NULL for the Rank attribute,
whereas tuples appearing only in INSTRUCTOR will have a NULL for the Advisor attribute.

A tuple that exists in both relations, which represents a student who is also an instructor, will have
values for all its attributes. The resulting relation, STUDENT_OR_INSTRUCTOR, will have the
following attributes:

STUDENT_OR_INSTRUCTOR(Name, Ssn, Department, Advisor, Rank)
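
The OUTER UNION of these two schemas can be sketched in Python, with None standing in for
NULL. The helper name outer_union and the sample tuples are hypothetical.

```python
# Hypothetical tuples; Ravi is both a student and an instructor.
STUDENT = [
    {"Name": "Asha", "Ssn": "111", "Department": "CS", "Advisor": "Rao"},
    {"Name": "Ravi", "Ssn": "222", "Department": "EE", "Advisor": "Iyer"},
]
INSTRUCTOR = [
    {"Name": "Ravi", "Ssn": "222", "Department": "EE", "Rank": "Lecturer"},
    {"Name": "Meena", "Ssn": "333", "Department": "CS", "Rank": "Professor"},
]
SHARED = ["Name", "Ssn", "Department"]
ALL_ATTRS = SHARED + ["Advisor", "Rank"]

def outer_union(r, s, shared, all_attrs):
    """Merge tuples that agree on the shared attributes; pad the rest with None."""
    merged = {}
    for t in r + s:
        key = tuple(t[a] for a in shared)
        row = merged.setdefault(key, {a: None for a in all_attrs})
        row.update(t)
    return list(merged.values())

STUDENT_OR_INSTRUCTOR = outer_union(STUDENT, INSTRUCTOR, SHARED, ALL_ATTRS)
```

The tuple for Ravi appears once with both Advisor and Rank filled in, while the other two tuples
each carry one None.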


Examples of Queries in Relational Algebra

Mapping Conceptual Design into a Logical Design:

Relational Database Design using ER-to-Relational mapping

Procedure to create a relational schema from an Entity-Relationship (ER) schema:

Step 1: Mapping of Regular Entity Types

For each regular (strong) entity type E in the ER schema, create a relation R that includes all
the simple attributes of E.

Include only the simple component attributes of a composite attribute.

Choose one of the key attributes of E as the primary key for R.

If the chosen key of E is a composite, then the set of simple attributes that form it will together
form the primary key of R.

If multiple keys were identified for E during the conceptual design, the information describing
the attributes that form each additional key is kept in order to specify secondary (unique) keys of
relation R.

In our example, the COMPANY database, we create the relations EMPLOYEE, DEPARTMENT,
and PROJECT. We choose Ssn, Dnumber, and Pnumber as primary keys for the relations
EMPLOYEE, DEPARTMENT, and PROJECT, respectively.

The relations that are created from the mapping of entity types are called entity relations because
each tuple represents an entity instance.

Step 2: Mapping of Weak Entity Types

For each weak entity type W in the ER schema with owner entity type E, create a relation R and
include all simple attributes (or simple components of composite attributes) of W as attributes
of R.

Include the primary key attribute(s) of the owner relation as foreign key attributes of R. The
primary key of R is the combination of the primary key(s) of the owner(s) and the partial key of
the weak entity type W, if any.

In our example, we create the relation DEPENDENT in this step to correspond to the weak entity
type DEPENDENT.

We include the primary key Ssn of the EMPLOYEE relation, which corresponds to the owner
entity type, as a foreign key attribute of DEPENDENT; we rename it as Essn.

The primary key of the DEPENDENT relation is the combination {Essn, Dependent_name},
because Dependent_name is the partial key of DEPENDENT.

It is common to choose the propagate (CASCADE) option for the referential triggered action on
the foreign key in the relation corresponding to the weak entity type, since a weak entity has an
existence dependency on its owner entity. This can be used for both ON UPDATE and ON
DELETE.
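
The effect of the CASCADE option on a weak entity relation can be sketched with Python's
built-in sqlite3 module. This is only an illustration under assumptions: the column types, the
reduced set of attributes, and the sample rows are hypothetical, and SQLite is used simply
because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE EMPLOYEE (
    Ssn   TEXT PRIMARY KEY,
    Lname TEXT
);
-- Weak entity relation: owner key Essn plus partial key Dependent_name.
CREATE TABLE DEPENDENT (
    Essn           TEXT NOT NULL,
    Dependent_name TEXT NOT NULL,
    PRIMARY KEY (Essn, Dependent_name),
    FOREIGN KEY (Essn) REFERENCES EMPLOYEE (Ssn)
        ON UPDATE CASCADE ON DELETE CASCADE
);
INSERT INTO EMPLOYEE VALUES ('111', 'Rao');
INSERT INTO DEPENDENT VALUES ('111', 'Anil');
INSERT INTO DEPENDENT VALUES ('111', 'Sita');
""")

before = conn.execute("SELECT COUNT(*) FROM DEPENDENT").fetchone()[0]
# Deleting the owner tuple propagates to its dependent tuples.
conn.execute("DELETE FROM EMPLOYEE WHERE Ssn = '111'")
after = conn.execute("SELECT COUNT(*) FROM DEPENDENT").fetchone()[0]
```

Once the owner employee is deleted, no DEPENDENT tuples remain, which reflects the existence
dependency of the weak entity on its owner.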

Step 3: Mapping of Binary 1:1 Relationship Types

For each binary 1:1 relationship type R in the ER schema, identify the relations S and T that
correspond to the entity types participating in R

There are three possible approaches:

- foreign key approach

- merged relationship approach

- cross-reference or relationship relation approach

1. The foreign key approach

Choose one of the relations, say S, and include as a foreign key in S the primary key of T.

It is better to choose an entity type with total participation in R in the role of S.

Include all the simple attributes (or simple components of composite attributes) of the 1:1
relationship type R as attributes of S.

In our example, we map the 1:1 relationship type MANAGES by choosing the participating
entity type DEPARTMENT to serve in the role of S because its participation in the MANAGES
relationship type is total.

We include the primary key of the EMPLOYEE relation as foreign key in the DEPARTMENT
relation and rename it Mgr_ssn.

We also include the simple attribute Start_date of the MANAGES relationship type in the
DEPARTMENT relation and rename it Mgr_start_date.

2. Merged relation approach: merge the two entity types and the relationship into a single
relation. This is possible when both participations are total, as this would indicate that the two
tables will have the exact same number of tuples at all times.

3. Cross-reference or relationship relation approach: set up a third relation R for the purpose
of cross-referencing the primary keys of the two relations S and T representing the entity types.
This approach is required for binary M:N relationships.

The relation R is called a relationship relation (or sometimes a lookup table), because each tuple
in R represents a relationship instance that relates one tuple from S with one tuple from T.

The relation R will include the primary key attributes of S and T as foreign keys to S and T.

The primary key of R will be one of the two foreign keys, and the other foreign key will be a
unique key of R.

The drawback is having an extra relation, and requiring an extra join operation when combining
related tuples from the tables.

Step 4: Mapping of Binary 1:N Relationship Types

For each regular binary 1:N relationship type R, identify the relation S that represents the
participating entity type at the N-side of the relationship type.

Include as foreign key in S the primary key of the relation T that represents the other entity type
participating in R

Include any simple attributes (or simple components of composite attributes) of the 1:N
relationship type as attributes of S

In our example, we now map the 1:N relationship types WORKS_FOR, CONTROLS, and
SUPERVISION. For WORKS_FOR we include the primary key Dnumber of the
DEPARTMENT relation as foreign key in the EMPLOYEE relation and call it Dno.

For SUPERVISION we include the primary key of the EMPLOYEE relation as foreign key in
the EMPLOYEE relation itself, because the relationship is recursive, and call it Super_ssn.

The CONTROLS relationship is mapped to the foreign key attribute Dnum of PROJECT, which
references the primary key Dnumber of the DEPARTMENT relation.
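
The three foreign keys described above can be sketched with Python's built-in sqlite3 module.
The column types, the reduced set of attributes, and the sample rows are assumptions made for
illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE DEPARTMENT (
    Dnumber INTEGER PRIMARY KEY,
    Dname   TEXT
);
CREATE TABLE EMPLOYEE (
    Ssn       TEXT PRIMARY KEY,
    Dno       INTEGER REFERENCES DEPARTMENT (Dnumber),  -- WORKS_FOR
    Super_ssn TEXT REFERENCES EMPLOYEE (Ssn)            -- SUPERVISION (recursive)
);
CREATE TABLE PROJECT (
    Pnumber INTEGER PRIMARY KEY,
    Dnum    INTEGER REFERENCES DEPARTMENT (Dnumber)     -- CONTROLS
);
INSERT INTO DEPARTMENT VALUES (5, 'Research');
INSERT INTO EMPLOYEE VALUES ('111', 5, NULL);
INSERT INTO EMPLOYEE VALUES ('222', 5, '111');
INSERT INTO PROJECT VALUES (1, 5);
""")

# An employee on the N-side cannot reference a department that does not exist.
try:
    conn.execute("INSERT INTO EMPLOYEE VALUES ('333', 9, NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

in_dept5 = conn.execute("SELECT COUNT(*) FROM EMPLOYEE WHERE Dno = 5").fetchone()[0]
```

In each case the foreign key is placed in the relation on the N-side, so one department tuple can
be referenced by many employee or project tuples without creating a new relation.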

Step 5: Mapping of Binary M:N Relationship Types

For each binary M:N relationship type R:

- Create a new relation S to represent R

- Include the primary keys of the participating entity types as foreign key attributes in S; their
combination forms the primary key of S

- Include any simple attributes of the M:N relationship type (or simple components of composite
attributes) as attributes of S

In our example, we map the M:N relationship type WORKS_ON by creating the relation
WORKS_ON.

We include the primary keys of the PROJECT and EMPLOYEE relations as foreign keys in
WORKS_ON and rename them Pno and Essn, respectively.

We also include an attribute Hours in WORKS_ON to represent the Hours attribute of the
relationship type.

The primary key of the WORKS_ON relation is the combination of the foreign key attributes
{Essn, Pno}.

The propagate (CASCADE) option for the referential triggered action should be specified on the
foreign keys in the relation corresponding to the relationship R, since each relationship instance
has an existence dependency on each of the entities it relates. This can be used for both ON
UPDATE and ON DELETE.
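
The WORKS_ON mapping, including its composite primary key and CASCADE actions, can be
sketched with Python's built-in sqlite3 module. The column types, the reduced EMPLOYEE and
PROJECT relations, and the sample rows are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY);
CREATE TABLE PROJECT (Pnumber INTEGER PRIMARY KEY);
-- Relationship relation for the M:N relationship type WORKS_ON.
CREATE TABLE WORKS_ON (
    Essn  TEXT NOT NULL,
    Pno   INTEGER NOT NULL,
    Hours REAL,
    PRIMARY KEY (Essn, Pno),
    FOREIGN KEY (Essn) REFERENCES EMPLOYEE (Ssn)
        ON UPDATE CASCADE ON DELETE CASCADE,
    FOREIGN KEY (Pno) REFERENCES PROJECT (Pnumber)
        ON UPDATE CASCADE ON DELETE CASCADE
);
INSERT INTO EMPLOYEE VALUES ('111'), ('222');
INSERT INTO PROJECT VALUES (1), (2);
INSERT INTO WORKS_ON VALUES ('111', 1, 10.0), ('111', 2, 20.0), ('222', 1, 30.0);
""")

# The composite primary key allows only one tuple per (employee, project) pair.
try:
    conn.execute("INSERT INTO WORKS_ON VALUES ('111', 1, 5.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Deleting a project removes the relationship instances that involve it.
conn.execute("DELETE FROM PROJECT WHERE Pnumber = 1")
remaining = conn.execute(
    "SELECT Essn, Pno, Hours FROM WORKS_ON ORDER BY Essn, Pno"
).fetchall()
```

After project 1 is deleted, only the relationship instance between employee 111 and project 2
remains.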

Step 6: Mapping of Multivalued Attributes

For each multivalued attribute A:

- Create a new relation R

- Include in R an attribute corresponding to A, plus the primary key attribute K (as a foreign key
in R) of the relation that represents the entity type or relationship type that has A as a
multivalued attribute

- The primary key of R is the combination of A and K

- If the multivalued attribute is composite, include its simple components

In our example, we create a relation DEPT_LOCATIONS.

The attribute Dlocation represents the multivalued attribute LOCATIONS of
DEPARTMENT, while Dnumber as foreign key represents the primary key of the
DEPARTMENT relation.

The primary key of DEPT_LOCATIONS is the combination of {Dnumber, Dlocation}.

A separate tuple will exist in DEPT_LOCATIONS for each location that a department has.

The propagate (CASCADE) option for the referential triggered action should be specified on the
foreign key in the relation R corresponding to the multivalued attribute for both ON UPDATE
and ON DELETE.
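
The DEPT_LOCATIONS mapping and the effect of ON UPDATE CASCADE can be sketched with
Python's built-in sqlite3 module. The column types and the sample rows are assumptions made
for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE DEPARTMENT (
    Dnumber INTEGER PRIMARY KEY,
    Dname   TEXT
);
-- One tuple per (department, location) pair.
CREATE TABLE DEPT_LOCATIONS (
    Dnumber   INTEGER NOT NULL,
    Dlocation TEXT NOT NULL,
    PRIMARY KEY (Dnumber, Dlocation),
    FOREIGN KEY (Dnumber) REFERENCES DEPARTMENT (Dnumber)
        ON UPDATE CASCADE ON DELETE CASCADE
);
INSERT INTO DEPARTMENT VALUES (5, 'Research');
INSERT INTO DEPT_LOCATIONS VALUES (5, 'Mysuru');
INSERT INTO DEPT_LOCATIONS VALUES (5, 'Bengaluru');
""")

# Renumbering the department propagates to every location tuple.
conn.execute("UPDATE DEPARTMENT SET Dnumber = 6 WHERE Dnumber = 5")
locations = conn.execute(
    "SELECT Dnumber, Dlocation FROM DEPT_LOCATIONS ORDER BY Dlocation"
).fetchall()
```

Both location tuples now carry the new department number, so the multivalued attribute stays
attached to its department.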

Step 7: Mapping of N-ary Relationship Types

For each n-ary relationship type R

Create a new relation S to represent R

Include primary keys of participating entity types as foreign keys

Include any simple attributes as attributes

The primary key of S is usually a combination of all the foreign keys that reference the relations
representing the participating entity types.

For example, consider the relationship type SUPPLY. This can be mapped to the relation
SUPPLY whose primary key is the combination of the three foreign keys {Sname, Part_no,
Proj_name}.

Question Bank

Module 2 - (Part 2)

1. Illustrate the relational algebra operators with examples for select and project operations.
2.

3. Explain the relational algebra operations from Set theory, with examples
4. Explain the ER to relational mapping algorithm with suitable example for each step.
5. Discuss equijoin and natural join with suitable example using relational algebra notation.
6. Explain the following unary operations with syntax and example
i) SELECT
ii) PROJECT
iii) RENAME
7. Explain the following binary operations with syntax and example
i) UNION
ii) INTERSECTION
iii) MINUS
iv) CROSS PRODUCT
v) DIVISION
8. Illustrate with an example, Aggregate Functions
9. Illustrate with an example, Recursive Closure Operations
10. What is union compatibility? Why do the UNION, INTERSECTION, and DIFFERENCE
operations require that the relations on which they are applied be union compatible?
11. How are the OUTER JOIN operations different from the INNER JOIN operations?
12. How is the OUTER UNION operation different from UNION?
13. Illustrate with an example, significance of generalized projection.
