Module2 DBMS (Part2)
Relational Algebra: Unary and Binary relational operations, additional relational operations
(aggregate, grouping, etc.), Examples of Queries in relational algebra.
Mapping Conceptual Design into a Logical Design: Relational Database Design using
ER-to-Relational mapping
Relational Algebra:
SELECT
The SELECT operator, denoted by σ (sigma), is used to select a subset of the tuples from a
relation based on a selection condition. The selection condition acts as a filter that keeps only
those tuples that satisfy a qualifying condition. Alternatively, we can consider the SELECT
operation to restrict the tuples in a relation to only those tuples that satisfy the condition.
The SELECT operation can also be visualized as a horizontal partition of the relation into two
sets of tuples: those tuples that satisfy the condition and are selected, and those tuples that do
not satisfy the condition and are discarded.
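In general, the SELECT operation is denoted by
σ<selection condition>(R)
where σ denotes the SELECT operator and <selection condition> is a Boolean expression specified
on the attributes of relation R. The Boolean expression is made up of clauses of the form
<attribute name> <comparison op> <constant value>
or
<attribute name> <comparison op> <attribute name>
where <comparison op> is normally one of the operators {=, <, ≤, >, ≥, ≠}.
Clauses can be connected by the standard Boolean operators and, or, and not to form a general
selection condition.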
Examples:
The Boolean conditions AND, OR, and NOT have their normal interpretation, as follows:
- (cond1 AND cond2) is TRUE if both (cond1) and (cond2) are TRUE; otherwise, it is FALSE.
- (cond1 OR cond2) is TRUE if either (cond1) or (cond2) or both are TRUE; otherwise, it is
FALSE.
- (NOT cond) is TRUE if cond is FALSE; otherwise, it is FALSE.
The SELECT operator is unary; that is, it is applied to a single relation. The degree of the
relation resulting from a SELECT operation is the same as the degree of R.
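The filtering behaviour of SELECT can be sketched in a few lines of Python. This is only an illustration: the EMPLOYEE tuples, their values, and the helper name `select` are assumptions, not part of the notes.

```python
# Hypothetical EMPLOYEE tuples; attribute values are made up for illustration.
EMPLOYEE = [
    {"Ssn": "111", "Dno": 5, "Salary": 30000},
    {"Ssn": "222", "Dno": 4, "Salary": 43000},
    {"Ssn": "333", "Dno": 5, "Salary": 25000},
]

def select(relation, condition):
    """Keep only the tuples for which the selection condition is TRUE."""
    return [t for t in relation if condition(t)]

# Selection condition: (Dno = 5 AND Salary > 28000) OR (Dno = 4)
result = select(
    EMPLOYEE,
    lambda t: (t["Dno"] == 5 and t["Salary"] > 28000) or t["Dno"] == 4,
)
selected_ssns = [t["Ssn"] for t in result]
```

Every tuple in `result` still has all three attributes, which mirrors the statement that SELECT does not change the degree of the relation.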
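PROJECT
The PROJECT operation, denoted by π (pi), selects certain columns from the table and discards
the other columns. It is used when we are interested in only certain attributes of a relation.
The result of the PROJECT operation can be visualized as a vertical partition of the relation into
two relations:
- one has the needed columns (attributes) and contains the result of the operation
- the other contains the discarded columns
The general form of the PROJECT operation is
π<attribute list>(R)
where π (pi) is the symbol used to represent the PROJECT operation and <attribute list> is the
desired sublist of attributes from the attributes of relation R.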
The result of the PROJECT operation has only the attributes specified in <attribute list> in the
same order as they appear in the list. Hence, its degree is equal to the number of attributes in
<attribute list>.
Example: To list each employee's first and last name and salary, we can use the PROJECT
operation as follows:
πFname, Lname, Salary(EMPLOYEE)
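If the attribute list does not include a key of R, duplicate tuples are likely to occur. The
PROJECT operation removes any duplicate tuples, so its result is a set of distinct tuples and
hence a valid relation. This is known as duplicate elimination.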
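A minimal Python sketch of PROJECT with duplicate elimination follows. The sample tuples and the helper name `project` are assumptions chosen for illustration.

```python
# Hypothetical EMPLOYEE tuples; two employees share the same (Dno, Salary).
EMPLOYEE = [
    {"Ssn": "111", "Dno": 5, "Salary": 30000},
    {"Ssn": "222", "Dno": 4, "Salary": 43000},
    {"Ssn": "333", "Dno": 5, "Salary": 30000},
]

def project(relation, attribute_list):
    """Keep only the listed attributes, in list order, and drop duplicate tuples."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attribute_list)
        if row not in seen:  # duplicate elimination
            seen.add(row)
            result.append(row)
    return result

dno_salary = project(EMPLOYEE, ["Dno", "Salary"])
```

Because Ssn (the key) is not in the attribute list, two input tuples collapse into one output tuple, so the result has two tuples of degree 2.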
For most queries, we need to apply several relational algebra operations one after the other.
Either we can write the operations as a single relational algebra expression by nesting the
operations, or we can apply one operation at a time and create intermediate result relations. In the
latter case, we must give names to the relations that hold the intermediate results.
For example, to retrieve the first name, last name, and salary of all employees who work in
department number 5, we must apply a SELECT and a PROJECT operation. We can write a
single relational algebra expression, also known as an in-line expression, as follows:
πFname, Lname, Salary(σDno=5(EMPLOYEE))
Alternatively, we can explicitly show the sequence of operations, giving a name to each
intermediate relation, as follows:
DEP5_EMPS ← σDno=5(EMPLOYEE)
RESULT ← πFname, Lname, Salary(DEP5_EMPS)
We can also use this technique to rename the attributes in the intermediate and result relations.
To rename the attributes in a relation, we simply list the new attribute names in parentheses.
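The two styles, nesting versus named intermediate relations, can be compared in a short Python sketch. The helper functions, the sample data, and the new attribute names `First_name` and `Last_name` are all assumptions made for this example.

```python
# Hypothetical EMPLOYEE tuples.
EMPLOYEE = [
    {"Fname": "Ann", "Lname": "Lee", "Ssn": "111", "Dno": 5, "Salary": 30000},
    {"Fname": "Bob", "Lname": "Roy", "Ssn": "222", "Dno": 4, "Salary": 43000},
]

def select(relation, condition):
    return [t for t in relation if condition(t)]

def project(relation, attrs):
    # Keep only the requested attributes and drop duplicate tuples.
    out = []
    for t in relation:
        row = {a: t[a] for a in attrs}
        if row not in out:
            out.append(row)
    return out

def rename(relation, new_names):
    # Assumes every tuple lists its attributes in the same order.
    return [dict(zip(new_names, t.values())) for t in relation]

# In-line (nested) expression.
inline = project(select(EMPLOYEE, lambda t: t["Dno"] == 5),
                 ["Fname", "Lname", "Salary"])

# Sequence of operations with named intermediate relations.
dep5_emps = select(EMPLOYEE, lambda t: t["Dno"] == 5)
result = project(dep5_emps, ["Fname", "Lname", "Salary"])

# Renaming the attributes of the result relation.
renamed = rename(result, ["First_name", "Last_name", "Salary"])
```

Both forms produce the same relation; the second form simply makes each step visible and gives a place to attach new attribute names.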
Relational Algebra Operations from Set Theory
UNION: The result of this operation, denoted by R ∪ S, is a relation that includes all tuples that
are either in R or in S or in both R and S. Duplicate tuples are eliminated.
INTERSECTION: The result of this operation, denoted by R ∩ S, is a relation that includes all
tuples that are in both R and S.
SET DIFFERENCE (or MINUS): The result of this operation, denoted by R − S, is a relation
that includes all tuples that are in R but not in S.
Example: Consider the following two relations: STUDENT and INSTRUCTOR.
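Since relations are sets of tuples, Python's built-in `set` type gives a direct sketch of the three operations. The STUDENT and INSTRUCTOR tuples here are hypothetical (first name, last name) pairs, not the data from the original example.

```python
# Hypothetical union-compatible relations: both hold (first name, last name).
STUDENT = {("Susan", "Yao"), ("Ramesh", "Shah"), ("Amy", "Ford")}
INSTRUCTOR = {("Susan", "Yao"), ("John", "Smith")}

union = STUDENT | INSTRUCTOR         # in STUDENT, in INSTRUCTOR, or in both
intersection = STUDENT & INSTRUCTOR  # in both relations
difference = STUDENT - INSTRUCTOR    # in STUDENT but not in INSTRUCTOR
```

The tuple ("Susan", "Yao") occurs in both inputs but only once in `union`, which illustrates duplicate elimination.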
The CARTESIAN PRODUCT (CROSS PRODUCT) Operation
The CARTESIAN PRODUCT operation, also known as CROSS PRODUCT or CROSS JOIN and
denoted by ×, is a binary set operation, but the relations on which it is applied do not have to be
union compatible. This set operation produces a new element by combining every member
(tuple) from one relation (set) with every member (tuple) from the other relation (set). For
example, if R has 3 tuples and S has 2 tuples, then R × S has 3 × 2 = 6 tuples.
Binary Relational Operations: JOIN and DIVISION
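The JOIN operation, denoted by ⋈, combines related tuples from two relations R(A1, A2, ..., An)
and S(B1, B2, ..., Bm) into single tuples. Its general form is
R ⋈<join condition> S
The result of the JOIN is a relation Q with n + m attributes Q(A1, A2, ..., An, B1, B2, ..., Bm) in
that order. Q has one tuple for each combination of tuples, one from R and one from S, whenever
the combination satisfies the join condition. This is the main difference between CARTESIAN
PRODUCT and JOIN. In JOIN, only combinations of tuples satisfying the join condition appear
in the result, whereas in the CARTESIAN PRODUCT all combinations of tuples are included in
the result. The join condition is specified on attributes from the two relations R and S and is
evaluated for each combination of tuples.
Each tuple combination for which the join condition evaluates to TRUE is included in the
resulting relation Q as a single combined tuple. A general join condition is of the form
<condition> AND <condition> AND ... AND <condition>
where each <condition> is of the form Ai θ Bj, Ai is an attribute of R, Bj is an attribute of S, Ai
and Bj have the same domain, and θ (theta) is one of the comparison operators {=, <, ≤, >, ≥, ≠}.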
The most common use of JOIN involves join conditions with equality comparisons only. Such a
JOIN, where the only comparison operator used is =, is called an EQUIJOIN. In the result of an
EQUIJOIN we always have one or more pairs of attributes that have identical values in every
tuple.
For example, the values of the attributes Mgr_ssn and Ssn are identical in every tuple of
DEPT_MGR (the EQUIJOIN result) because the equality join condition specified on these two
attributes requires the values to be identical in every tuple in the result.
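The difference between CARTESIAN PRODUCT and JOIN can be sketched in Python as a nested loop with a filter. The helper `theta_join` and the sample tuples are assumptions; only the attribute names Ssn and Mgr_ssn and the relation name DEPT_MGR come from the text.

```python
# Hypothetical sample data.
EMPLOYEE = [
    {"Ssn": "111", "Lname": "Lee"},
    {"Ssn": "222", "Lname": "Roy"},
]
DEPARTMENT = [
    {"Dnumber": 5, "Mgr_ssn": "111"},
    {"Dnumber": 4, "Mgr_ssn": "222"},
]

def theta_join(r, s, condition):
    """Combine every pair (tr, ts) that satisfies the join condition.

    Assumes the two relations have no attribute names in common.
    """
    return [{**tr, **ts} for tr in r for ts in s if condition(tr, ts)]

# EQUIJOIN: the only comparison operator in the condition is "=".
DEPT_MGR = theta_join(DEPARTMENT, EMPLOYEE,
                      lambda d, e: d["Mgr_ssn"] == e["Ssn"])
```

The Cartesian product of these two relations would have 2 × 2 = 4 tuples; the EQUIJOIN keeps only the 2 combinations in which Mgr_ssn equals Ssn, and each result tuple has 2 + 2 = 4 attributes.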
In an EQUIJOIN result, one attribute of each pair of join attributes is superfluous because the two
always hold identical values. The NATURAL JOIN operation removes the second (superfluous)
attribute of each pair.
The standard definition of NATURAL JOIN requires that the two join attributes (or each pair of
join attributes) have the same name in both relations. If this is not the case, a renaming operation
is applied first. Suppose we want to combine each PROJECT tuple with the DEPARTMENT
tuple that controls the project. First, we rename the Dnumber attribute of DEPARTMENT to
Dnum so that it has the same name as the Dnum attribute in PROJECT, and then we apply
NATURAL JOIN to PROJECT and the renamed DEPARTMENT relation.
The attribute Dnum is called the join attribute for the NATURAL JOIN operation, because it is
the only attribute with the same name in both relations.
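The rename-then-join sequence can be sketched as follows. The helpers, the sample tuples, and the attribute Pnumber are assumptions; Dnum, Dnumber, and Mgr_ssn are the names used in these notes.

```python
# Hypothetical sample data.
PROJECT = [
    {"Pnumber": 1, "Dnum": 5},
    {"Pnumber": 2, "Dnum": 4},
    {"Pnumber": 3, "Dnum": 5},
]
DEPARTMENT = [
    {"Dnumber": 5, "Mgr_ssn": "111"},
    {"Dnumber": 4, "Mgr_ssn": "222"},
]

def rename_attribute(relation, old, new):
    return [{(new if k == old else k): v for k, v in t.items()} for t in relation]

def natural_join(r, s):
    """Join on all attributes with the same name; keep each join attribute once."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    return [
        {**tr, **ts}
        for tr in r
        for ts in s
        if all(tr[a] == ts[a] for a in common)
    ]

DEPT = rename_attribute(DEPARTMENT, "Dnumber", "Dnum")  # rename first
PROJ_DEPT = natural_join(PROJECT, DEPT)                 # then join on Dnum
```

Each result tuple contains Dnum only once, unlike an EQUIJOIN on Dnum = Dnumber, which would keep both columns.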
A Complete Set of Relational Algebra Operations
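The DIVISION Operation
The DIVISION operation, denoted by ÷, is useful for a special kind of query that sometimes
occurs in database applications: finding the tuples in one relation that are related to every tuple
in another relation.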
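As a hedged illustration, DIVISION can be sketched for the simple case of a binary relation R(Y, X) divided by a unary relation S(X): the result contains each Y value that is paired in R with every X value in S. The data and the helper name `divide` are hypothetical.

```python
# Hypothetical (Essn, Pno) pairs: which employee works on which project.
R = {("111", 1), ("111", 2), ("222", 1), ("333", 1), ("333", 2), ("333", 3)}
# The project numbers that every result employee must be paired with.
S = {1, 2}

def divide(r, s):
    """Return each y such that (y, x) is in r for every x in s."""
    candidates = {y for (y, _) in r}
    return {y for y in candidates if all((y, x) in r for x in s)}

result = divide(R, S)
```

Employee "222" works only on project 1, so it is excluded; "111" and "333" work on both projects in S and are kept.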
Additional Relational Operations
Generalized Projection
The generalized projection operation extends the projection operation by allowing functions of
attributes to be included in the projection list. The generalized form can be expressed as:
πF1, F2, ..., Fn(R)
where F1, F2, ..., Fn are functions over the attributes in relation R and may involve arithmetic
operations and constant values. The generalized projection is helpful when developing reports
where computed values have to be produced in the columns of a query result. For example,
consider the relation EMPLOYEE(Ssn, Salary, Deduction, Years_service). A report may be
required to show values computed from these attributes.
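A small Python sketch shows the idea of putting functions in the projection list. The report columns `Net_salary` and `Bonus`, their formulas, and the sample values are assumptions made for illustration; only the EMPLOYEE attribute names come from the text.

```python
# Hypothetical EMPLOYEE tuples.
EMPLOYEE = [
    {"Ssn": "111", "Salary": 30000, "Deduction": 2000, "Years_service": 4},
    {"Ssn": "222", "Salary": 43000, "Deduction": 3000, "Years_service": 10},
]

def generalized_project(relation, functions):
    """functions maps each output column name to a function of one tuple."""
    return [{name: f(t) for name, f in functions.items()} for t in relation]

REPORT = generalized_project(
    EMPLOYEE,
    {
        "Ssn": lambda t: t["Ssn"],
        "Net_salary": lambda t: t["Salary"] - t["Deduction"],  # assumed formula
        "Bonus": lambda t: 2000 * t["Years_service"],          # assumed formula
    },
)
```

Each output column is a function F of the input attributes: `Ssn` is the identity function, while the other two involve arithmetic and a constant.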
Aggregate Functions and Grouping
Aggregate functions are used in simple statistical queries that summarize information from the
database tuples. Common functions applied to collections of numeric values include SUM,
AVERAGE, MAXIMUM, and MINIMUM. The COUNT function is used for counting tuples or
values. Examples include retrieving the average or total salary of all employees, or the total
number of employee tuples.
Another common type of request involves grouping the tuples in a relation by the value of some
of their attributes and then applying an aggregate function independently to each group. For
example, we can group EMPLOYEE tuples by Dno, so that each group includes the tuples for
employees working in the same department. We can then list each Dno value along with, say,
the average salary of employees within the department, or the number of employees who work
in the department.
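The aggregate function operation is denoted by the symbol ℱ (script F):
<grouping attributes> ℱ <function list> (R)
where <grouping attributes> is a list of attributes of the relation specified in R, and
<function list> is a list of (<function> <attribute>) pairs. In each pair, <function> is one of the
allowed functions, such as SUM, AVERAGE, MAXIMUM, MINIMUM, or COUNT, and
<attribute> is an attribute of the relation specified by R.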
Example: To retrieve each department number, the number of employees in the department, and
their average salary, while renaming the resulting attributes.
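Grouping followed by per-group aggregation can be sketched in Python as follows. The sample tuples and the result attribute names `No_of_employees` and `Average_sal` are assumptions chosen for this example.

```python
from collections import defaultdict

# Hypothetical EMPLOYEE tuples.
EMPLOYEE = [
    {"Ssn": "111", "Dno": 5, "Salary": 30000},
    {"Ssn": "222", "Dno": 4, "Salary": 43000},
    {"Ssn": "333", "Dno": 5, "Salary": 40000},
]

# Step 1: group the tuples by the grouping attribute Dno.
groups = defaultdict(list)
for t in EMPLOYEE:
    groups[t["Dno"]].append(t)

# Step 2: apply COUNT and AVERAGE independently to each group,
# naming the resulting attributes as we go.
result = [
    {
        "Dno": dno,
        "No_of_employees": len(members),
        "Average_sal": sum(m["Salary"] for m in members) / len(members),
    }
    for dno, members in sorted(groups.items())
]
```

The result has one tuple per distinct Dno value, not one per employee.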
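Recursive Closure Operations
The recursive closure operation is applied to a recursive relationship between tuples of the same
type, such as the relationship between an employee and a supervisor.
For example, a first step is to retrieve the Ssn of all employees directly supervised by
'Elsa David' (at level 1). Repeating the same step on that result retrieves the employees
supervised at level 2, and so on for each further level.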
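The level-by-level idea can be sketched in Python by repeating the level-1 step until no new supervisees appear. The (Ssn, Super_ssn) pairs and the supervisor's Ssn "111" are hypothetical values, and the function name is an assumption.

```python
# Hypothetical (Ssn, Super_ssn) pairs taken from a supervision relationship.
SUPERVISION = {("222", "111"), ("333", "111"), ("444", "222"), ("555", "444")}

def supervisees_at_all_levels(supervision, boss_ssn):
    """Repeat the level-1 step until no new supervisees are found."""
    found = set()
    frontier = {boss_ssn}
    while frontier:
        next_level = {ssn for (ssn, sup) in supervision if sup in frontier}
        frontier = next_level - found  # only follow newly found employees
        found |= next_level
    return found

# Level 1 only: employees directly supervised by "111".
level1 = {ssn for (ssn, sup) in SUPERVISION if sup == "111"}

# All levels: the recursive closure.
all_levels = supervisees_at_all_levels(SUPERVISION, "111")
```

A fixed number of levels can be written as a fixed number of joins, but retrieving all levels needs a loop like the one above, because the depth of the hierarchy is not known in advance.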
OUTER JOIN Operations
A set of operations, called outer joins, was developed for the case where the user wants to keep
all the tuples in R, or all those in S, or all those in both relations in the result of the JOIN,
regardless of whether or not they have matching tuples in the other relation. For example,
suppose that we want a list of all employee names as well as the name of the departments they
manage if they happen to manage a department; if they do not manage one, we can indicate it
with a NULL value. We can apply the LEFT OUTER JOIN operation, denoted by ⟕, to
EMPLOYEE and DEPARTMENT with the join condition Ssn = Mgr_ssn to retrieve this result.
The LEFT OUTER JOIN operation keeps every tuple in the first, or left, relation R; if no
matching tuple is found in S, then the attributes of S in the join result are filled or padded with
NULL values.
A similar operation, RIGHT OUTER JOIN, denoted by ⟖, keeps every tuple in the second, or
right, relation S.
A third operation, FULL OUTER JOIN, denoted by ⟗, keeps all tuples in both the left and the
right relations when no matching tuples are found, padding them with NULL values as needed.
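A LEFT OUTER JOIN can be sketched in Python by padding unmatched left tuples with `None`, which stands in for NULL here. The helper, the sample data, and the attribute names Lname and Dname are assumptions; Ssn and Mgr_ssn are the names used in these notes.

```python
# Hypothetical sample data: only one employee manages a department.
EMPLOYEE = [
    {"Ssn": "111", "Lname": "Lee"},
    {"Ssn": "222", "Lname": "Roy"},
    {"Ssn": "333", "Lname": "Kim"},
]
DEPARTMENT = [
    {"Dname": "Research", "Mgr_ssn": "111"},
]

def left_outer_join(r, s, condition, s_attributes):
    """Keep every tuple of r; pad with None (NULL) when no tuple of s matches."""
    result = []
    for tr in r:
        matches = [ts for ts in s if condition(tr, ts)]
        if matches:
            result.extend({**tr, **ts} for ts in matches)
        else:
            result.append({**tr, **{a: None for a in s_attributes}})
    return result

TEMP = left_outer_join(
    EMPLOYEE,
    DEPARTMENT,
    lambda e, d: e["Ssn"] == d["Mgr_ssn"],
    s_attributes=["Dname", "Mgr_ssn"],
)
names = [(t["Lname"], t["Dname"]) for t in TEMP]
```

An inner join would return only the tuple for "Lee"; the outer join keeps all three employees. A RIGHT OUTER JOIN can be obtained from the same helper by swapping the two relations.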
The OUTER UNION Operation
The OUTER UNION operation was developed to take the union of tuples from two relations that
have some common attributes, but are not union (type) compatible. It takes the union of tuples in
two relations R(X, Y) and S(X, Z) that are partially compatible, meaning that only some of their
attributes, say X, are union compatible. The result is a relation T(X, Y, Z).
Two tuples t1 in R and t2 in S are said to match if t1[X] = t2[X]. These will be combined
(unioned) into a single tuple in T. Tuples in either relation that have no matching tuple in the
other relation are padded with NULL values.
For example, an OUTER UNION can be applied to two relations whose schemas are:
STUDENT(Name, Ssn, Department, Advisor)
INSTRUCTOR(Name, Ssn, Department, Rank)
Tuples from the two relations are matched based on having the same combination of values of
the shared attributes Name, Ssn, Department. All the tuples from both relations are included in
the result, but tuples with the same (Name, Ssn, Department) combination will appear only once
in the result. Tuples appearing only in STUDENT will have a NULL for the Rank attribute,
whereas tuples appearing only in INSTRUCTOR will have a NULL for the Advisor attribute.
A tuple that exists in both relations, which represents a student who is also an instructor, will
have values for all its attributes. The resulting relation, STUDENT_OR_INSTRUCTOR, will have
the following attributes:
STUDENT_OR_INSTRUCTOR(Name, Ssn, Department, Advisor, Rank)
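The matching and padding rules of OUTER UNION can be sketched in Python. The sample tuples and the helper name `outer_union` are assumptions, and the sketch assumes that each (Name, Ssn, Department) combination appears at most once per relation.

```python
# Hypothetical sample data; "Raj" is both a student and an instructor.
STUDENT = [
    {"Name": "Amy", "Ssn": "111", "Department": "CS", "Advisor": "Dr. Rao"},
    {"Name": "Raj", "Ssn": "222", "Department": "CS", "Advisor": "Dr. Lee"},
]
INSTRUCTOR = [
    {"Name": "Raj", "Ssn": "222", "Department": "CS", "Rank": "Lecturer"},
    {"Name": "Eva", "Ssn": "333", "Department": "EE", "Rank": "Professor"},
]
SHARED = ("Name", "Ssn", "Department")

def outer_union(r, s, shared, r_only, s_only):
    merged = {}
    for t in r + s:
        key = tuple(t[a] for a in shared)
        # Start each combined tuple with None (NULL) for all non-shared attributes.
        blank = {**dict(zip(shared, key)), **{a: None for a in r_only + s_only}}
        row = merged.setdefault(key, blank)
        row.update(t)  # matching tuples are combined into a single tuple
    return list(merged.values())

STUDENT_OR_INSTRUCTOR = outer_union(STUDENT, INSTRUCTOR, SHARED, ["Advisor"], ["Rank"])
by_ssn = {t["Ssn"]: t for t in STUDENT_OR_INSTRUCTOR}
```

The result has three tuples: one padded with a NULL Rank, one padded with a NULL Advisor, and one with values for every attribute.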
Relational Database Design using ER-to-Relational Mapping
Step 1: Mapping of Regular Entity Types
For each regular (strong) entity type E in the ER schema, create a relation R that includes all the
simple attributes of E. Choose one of the key attributes of E as the primary key for R.
If multiple keys were identified for E during the conceptual design, the information describing
the attributes that form each additional key is kept in order to specify secondary (unique) keys of
relation R.
The relations that are created from the mapping of entity types are called entity relations because
each tuple represents an entity instance.
Step 2: Mapping of Weak Entity Types
For each weak entity type, create a relation R and include all simple attributes of the entity type
as attributes of R. In addition, include as foreign key attributes of R the primary key attribute(s)
of the relation that corresponds to the owner entity type.
In our example, we create the relation DEPENDENT in this step to correspond to the weak entity
type DEPENDENT.
We include the primary key Ssn of the EMPLOYEE relation, which corresponds to the owner
entity type, as a foreign key attribute of DEPENDENT; we rename it as Essn.
It is common to choose the propagate (CASCADE) option for the referential triggered action on
the foreign key in the relation corresponding to the weak entity type, since a weak entity has an
existence dependency on its owner entity.
This can be used for both ON UPDATE and ON DELETE.
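The effect of the CASCADE option can be tried out with Python's built-in `sqlite3` module. This is a sketch under assumptions: the tables are cut down to the columns needed, the attribute Dependent_name and the sample rows are hypothetical, and SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce foreign keys
conn.executescript("""
CREATE TABLE EMPLOYEE (
    Ssn TEXT PRIMARY KEY
);
CREATE TABLE DEPENDENT (
    Essn TEXT NOT NULL,
    Dependent_name TEXT NOT NULL,          -- hypothetical partial key
    PRIMARY KEY (Essn, Dependent_name),
    FOREIGN KEY (Essn) REFERENCES EMPLOYEE (Ssn)
        ON UPDATE CASCADE ON DELETE CASCADE
);
INSERT INTO EMPLOYEE VALUES ('111'), ('222');
INSERT INTO DEPENDENT VALUES ('111', 'Alice'), ('222', 'Ben');
""")

# ON UPDATE CASCADE: changing the owner's key propagates to DEPENDENT.Essn.
conn.execute("UPDATE EMPLOYEE SET Ssn = '999' WHERE Ssn = '222'")
# ON DELETE CASCADE: deleting the owner removes its dependents.
conn.execute("DELETE FROM EMPLOYEE WHERE Ssn = '111'")

remaining = conn.execute(
    "SELECT Essn, Dependent_name FROM DEPENDENT ORDER BY Essn"
).fetchall()
```

After both statements, the only DEPENDENT row left belongs to the employee whose Ssn was changed, and its Essn has followed that change.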
Step 3: Mapping of Binary 1:1 Relationship Types
For each binary 1:1 relationship type R in the ER schema, identify the relations S and T that
correspond to the entity types participating in R. There are three possible approaches:
1. Foreign key approach: choose one of the relations, say S, and include as a foreign key in S the
primary key of T. It is better to choose an entity type with total participation in R in the role of S.
Include all the simple attributes (or simple components of composite attributes) of the 1:1
relationship type R as attributes of S.
In our example, we map the 1:1 relationship type MANAGES by choosing the participating
entity type DEPARTMENT to serve in the role of S because its participation in the MANAGES
relationship type is total.
We include the primary key of the EMPLOYEE relation as foreign key in the DEPARTMENT
relation and rename it Mgr_ssn.
We also include the simple attribute Start_date of the MANAGES relationship type in the
DEPARTMENT relation and rename it Mgr_start_date.
2. Merged relation approach: merge the two entity types and the relationship into a single
relation. This is possible when both participations are total, as this would indicate that the two
tables will have the exact same number of tuples at all times.
3. Cross-reference or relationship relation approach: set up a third relation R for the purpose
of cross-referencing the primary keys of the two relations S and T representing the entity types.
This approach is required for binary M:N relationships.
The relation R is called a relationship relation (or sometimes a lookup table), because each tuple
in R represents a relationship instance that relates one tuple from S with one tuple from T.
The relation R will include the primary key attributes of S and T as foreign keys to S and T.
The primary key of R will be one of the two foreign keys, and the other foreign key will be a
unique key of R.
The drawback is having an extra relation, and requiring an extra join operation when combining
related tuples from the tables.
Step 4: Mapping of Binary 1:N Relationship Types
For each regular binary 1:N relationship type R, identify the relation S that represents the
participating entity type at the N-side of the relationship type.
Include as foreign key in S the primary key of the relation T that represents the other entity type
participating in R.
Include any simple attributes (or simple components of composite attributes) of the 1:N
relationship type as attributes of S.
In our example, we now map the 1:N relationship types WORKS_FOR, CONTROLS, and
SUPERVISION.
For WORKS_FOR we include the primary key Dnumber of the DEPARTMENT relation as
foreign key in the EMPLOYEE relation and call it Dno.
For SUPERVISION we include the primary key of the EMPLOYEE relation as foreign key in
the EMPLOYEE relation itself, because the relationship is recursive, and call it Super_ssn.
The CONTROLS relationship is mapped to the foreign key attribute Dnum of PROJECT, which
references the primary key Dnumber of the DEPARTMENT relation.
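Step 5: Mapping of Binary M:N Relationship Types
For each binary M:N relationship type R, create a new relation S to represent R. Include as
foreign key attributes in S the primary keys of the relations that represent the participating entity
types; their combination will form the primary key of S. Also include any simple attributes of
the M:N relationship type as attributes of S.
In our example, we map the M:N relationship type WORKS_ON by creating the relation
WORKS_ON.
We include the primary keys of the PROJECT and EMPLOYEE relations as foreign keys in
WORKS_ON and rename them Pno and Essn, respectively.
We also include an attribute Hours in WORKS_ON to represent the Hours attribute of the
relationship type.
The primary key of the WORKS_ON relation is the combination of the foreign key attributes
{Essn, Pno}.
The propagate (CASCADE) option for the referential triggered action should be specified on the
foreign keys in the relation corresponding to the relationship R, since each relationship instance
has an existence dependency on each of the entities it relates. This can be used for both ON
UPDATE and ON DELETE.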
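Step 6: Mapping of Multivalued Attributes
For each multivalued attribute A, create a new relation R. This relation will include an attribute
corresponding to A, plus the primary key attribute K, as a foreign key in R, of the relation that
represents the entity type that has A as a multivalued attribute.
In our example, we create the relation DEPT_LOCATIONS. One attribute of DEPT_LOCATIONS
represents the multivalued attribute of DEPARTMENT, while Dnumber as foreign key represents
the primary key of the DEPARTMENT relation.
A separate tuple will exist in DEPT_LOCATIONS for each location that a department has.
The propagate (CASCADE) option for the referential triggered action should be specified on the
foreign key in the relation R corresponding to the multivalued attribute for both ON UPDATE
and ON DELETE.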
Step 7: Mapping of N-ary Relationship Types
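For each n-ary relationship type R, where n > 2, create a new relation S to represent R. Include
as foreign key attributes in S the primary keys of the relations that represent the participating
entity types.
The primary key of S is usually a combination of all the foreign keys that reference the relations
representing the participating entity types.
For example, consider the relationship type SUPPLY. This can be mapped to the relation
SUPPLY whose primary key is the combination of the three foreign keys {Sname, Part_no,
Proj_name}.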
Question Bank
Module 2 - (Part 2)
1. Illustrate the relational algebra operators with examples for select and project operations.
2.
3. Explain the relational algebra operations from Set theory, with examples
4. Explain the ER to relational mapping algorithm with suitable example for each step.
5. Discuss equijoin and natural join with suitable example using relational algebra notation.
6. Explain the following unary operations with syntax and example
i) SELECT
ii) PROJECT
iii) RENAME
7. Explain the following binary operations with syntax and example
i) UNION
ii) INTERSECTION
iii) MINUS
iv) CROSS PRODUCT
v) DIVISION
8. Illustrate with an example, Aggregate Functions
9. Illustrate with an example, Recursive Closure Operations
10. What is union compatibility? Why do the UNION, INTERSECTION, and DIFFERENCE
operations require that the relations on which they are applied be union compatible?
11. How are the OUTER JOIN operations different from the INNER JOIN operations?
12. How is the OUTER UNION operation different from UNION?
13. Illustrate with an example, significance of generalized projection.