Module 2

The document outlines the concepts of the Relational Model in Database Management Systems, covering relational model concepts, relational algebra, and the mapping of conceptual design into logical design. It details the structure of relations, attributes, tuples, and constraints, including domain, key, and referential integrity constraints. Additionally, it discusses update operations and the potential for constraint violations during data manipulation.

Uploaded by

utkarshk0804
Copyright
© © All Rights Reserved

Database Management Systems (BCS403) – 2023-24
Module 2
USHA K PATIL
Department of CS&E
The National Institute of Engineering
Topics
Relational Model
• Relational Model Concepts, Relational Model Constraints
and relational database schemas, Update operations,
transactions, and dealing with constraint violations.
Relational Algebra
• Unary and Binary relational operations, additional
relational operations (aggregate, grouping, etc.),
Examples of Queries in relational algebra.
Mapping Conceptual Design into a Logical Design
• Relational Database Design using ER-to-Relational
mapping
Module 2 – Chapter 1
Relational Model Concepts
• The relational model represents the database as a
collection of relations.
• Informally, each relation resembles a table of values or,
to some extent, a flat file of records
• A relation is thought of as a table of values, each row in
the table represents a collection of related data values.
• A row represents a fact that typically corresponds to a
real-world entity or relationship.
• The table name and column names help to interpret the
meaning of the values in each row.
Relational Model Concepts
• In the formal relational model terminology, a row is called
a tuple, a column header is called an attribute, and the table
is called a relation.
• The data type describing the types of values that can
appear in each column is represented by a domain of
possible values.
Relational Model Concepts
Domains, Attributes, Tuples, and Relations
• A domain D is a set of atomic values.
• Atomic means that each value in the domain is indivisible as far
as the formal relational model is concerned.
• A common method of specifying a domain is to specify a data type
from which the data values forming the domain are drawn.
• Some examples of domains follow:
• USA_phone_number: string of digits of length ten
• SSN: string of digits of length nine
• Name: string of characters beginning with an upper-case letter
• GPA: a real number between 0.0 and 4.0
• Sex: a member of the set { female, male }
• Dept_Code: a member of the set { CMPS, MATH, ENGL, PHYS, PSYC, ... }
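One way to read the domain examples above is as membership tests. The sketch below is only an illustration: the `DOMAINS` table and the `in_domain` helper are hypothetical names, and the chosen data types (strings for digit sequences, floats for GPA) are assumptions.

```python
import re

# Hypothetical membership tests for some of the domains listed above.
DOMAINS = {
    # string of digits of length ten
    "USA_phone_number": lambda v: isinstance(v, str) and re.fullmatch(r"[0-9]{10}", v) is not None,
    # string of digits of length nine
    "SSN": lambda v: isinstance(v, str) and re.fullmatch(r"[0-9]{9}", v) is not None,
    # real number between 0.0 and 4.0
    "GPA": lambda v: isinstance(v, float) and 0.0 <= v <= 4.0,
    # member of a fixed set
    "Sex": lambda v: v in {"female", "male"},
}

def in_domain(domain_name, value):
    """Return True if value belongs to the named domain."""
    return DOMAINS[domain_name](value)
```

For instance, `in_domain("SSN", "305612435")` is accepted, while a value containing hyphens is rejected because the domain is defined as a string of nine digits.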
Relational Model Concepts
• A relation schema R, denoted by R(A1, A2, … , An), is made up
of a relation name R and a list of attributes, A1, A2, … , An.
• Attribute: Ai is the name of a role played by some domain D
in the relation schema R. D is called the domain of Ai and is
denoted by dom(Ai).
• Tuple: A tuple is a mapping from attributes to values drawn
from the respective domains of those attributes.
• A tuple is intended to describe some entity (or relationship
between entities) in the miniworld.
• R is called the name of this relation.
Relational Model Concepts
• The degree (or arity) of a relation is the number of attributes ‘n’ of its
relation schema.
• A relation of degree seven, which stores information about university
students, would contain seven attributes describing each student as
follows:
• STUDENT(Name, Ssn, Home_phone, Address, Office_phone, Age,
Gpa)
• Relational Database: A collection of relations, each one consistent
with its specified relational schema.
• A relation (or relation state) r of the relation schema R(A1, A2, … , An),
also denoted by r(R), is a set of n-tuples r = {t1, t2, … , tm}. Each
n-tuple t is an ordered list of n values t = <v1, v2, … , vn>, where each
value vi is an element of dom(Ai) or is a special NULL value.
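The definitions of schema, degree, and relation state can be sketched in Python as follows. The two student rows are made-up data, `None` is assumed to stand for NULL, and the variable names are illustrative only.

```python
# Relation schema STUDENT(Name, Ssn, Home_phone, Address, Office_phone, Age, Gpa)
STUDENT_SCHEMA = ("Name", "Ssn", "Home_phone", "Address", "Office_phone", "Age", "Gpa")

# The degree (arity) is the number of attributes in the schema.
degree = len(STUDENT_SCHEMA)

# A relation state is a set of n-tuples; the values below are invented.
student_state = {
    ("Alice Example", "111111111", None, "1 Sample Road", None, 19, 3.2),
    ("Bob Example", "222222222", "5550001111", "2 Sample Road", None, 21, 2.9),
}

# Every tuple in the state must have exactly `degree` values.
all_have_degree = all(len(t) == degree for t in student_state)
```

Because the state is a Python `set`, adding a tuple that is already present leaves it unchanged, which mirrors the rule that a relation contains no duplicate tuples.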
Relational Model Concepts

Characteristics of Relations

1. Ordering of Tuples in a Relation


• A relation is defined as a set of tuples.
• Mathematically, elements of a set have no order among
them; hence, tuples in a relation do not have any
particular order.
• When tuples are represented on a storage device, they
must be organized in some fashion, and it may be
advantageous, from a performance standpoint, to
organize them in a way that depends upon their content.
Relational Model Concepts
2. Ordering of Values within a Tuple
• The order of attributes and their values is not that important as
long as the correspondence between attributes and values is
maintained.
• A tuple can be considered as a set of (<attribute>, <value>) pairs,
where each pair gives the value of the mapping from an attribute Ai
to a value vi from dom(Ai).
• The ordering of attributes is not important, because the attribute
name appears with its value.
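The view of a tuple as a set of (attribute, value) pairs maps naturally onto a Python dictionary. This is a small sketch with invented values; the names `t_a`, `t_b`, and `as_pairs` are illustrative.

```python
# The same tuple written with two different attribute orders.
t_a = {"Name": "Alice Example", "Ssn": "111111111", "Age": 19}
t_b = {"Age": 19, "Ssn": "111111111", "Name": "Alice Example"}

# Dictionaries compare by their key/value pairs, not by insertion order.
same_tuple = (t_a == t_b)

# The explicit "set of pairs" form of the tuple.
as_pairs = frozenset(t_a.items())
```

Both spellings describe the same tuple, because each value travels with its attribute name.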
Relational Model Concepts
3. Values and NULLs in the Tuples
• Each value in a tuple is an atomic value; that is, it is not
divisible into components.
• An important concept is that of NULL values, which are used to
represent the values of attributes that may be unknown or may not
apply to a tuple.
• NULL values have several meanings, such as value unknown, value
exists but is not available, or attribute does not apply to this
tuple.
Relational Model Concepts
4. Interpretation (Meaning) of a Relation
• Each tuple in the relation can then be interpreted as a fact or a
particular instance of the assertion.
• Each relation can be viewed as a predicate and each tuple in that
relation can be viewed as an assertion for which that predicate is
satisfied (i.e., has value true) for the combination of values in it.
• Example: There exists a student having name Benjamin Bayer, having
SSN 305-61-2435, having age 19, etc.
Relational Model Concepts
Relational Model Notation: The following notation is used for
presentation:
• A relation schema R of degree n is denoted by R(A1, A2, … , An).
• The uppercase letters Q, R, S denote relation names.
• The lowercase letters q, r, s denote relation states.
• The letters t, u, v denote tuples.
• In general, the name of a relation schema such as STUDENT also
indicates the current set of tuples in that relation (the current
relation state), whereas STUDENT(Name, Ssn, …) refers only to the
relation schema.
Relational Model Concepts
• An attribute ‘A’ can be qualified with the relation name ‘R’ to
which it belongs by using the dot notation R.A, for example,
STUDENT.Name or STUDENT.Age. This is because the same name may be
used for two attributes in different relations.
• An n-tuple ‘t’ in a relation r(R) is denoted by
t = <v1, v2, … , vn>, where vi is the value corresponding to
attribute Ai. The following notation refers to component values of
tuples:
• Both t[Ai] and t.Ai (and sometimes t[i]) refer to the value vi in t
for attribute Ai.
• Both t[Au, Aw, … , Az] and t.(Au, Aw, … , Az), where Au, Aw, … , Az
is a list of attributes from R, refer to the subtuple of values from
t corresponding to the attributes specified in the list.
Relational Model Constraints and Relational Database Schemas
Relational model constraints on databases can generally be divided
into three main categories:
1. Constraints that are inherent in the data model, known as inherent
model-based constraints or implicit constraints.
2. Constraints that can be directly expressed in the schemas of the
data model, typically by specifying them in the DDL, known as
schema-based constraints or explicit constraints.
3. Constraints that cannot be directly expressed in the schemas of the
data model, and hence must be expressed and enforced by the
application programs or in some other way, known as application-based
or semantic constraints or business rules.
Relational Model Constraints and Relational Database Schemas
• The schema-based constraints include domain constraints, key
constraints, constraints on NULLs, entity integrity constraints, and
referential integrity constraints.
Domain Constraints
• Domain constraints specify that within each tuple, the value of each
attribute A must be an atomic value from the domain dom(A).
• The data types associated with domains typically include standard
numeric data types for integers and real numbers. Characters,
Booleans, fixed-length strings, and variable-length strings are also
available, as are date, time, timestamp, and other special data types.
Relational Model Constraints and Relational Database Schemas
Key Constraints and Constraints on NULL Values
• In the formal relational model, a relation is defined as a set of
tuples.
• By definition, all elements of a set are distinct; hence, all tuples
in a relation must also be distinct.
• This means that no two tuples can have the same combination of values
for all their attributes.
• Usually, there are other subsets of attributes of a relation schema R
with the property that no two tuples in any relation state r of R
should have the same combination of values for these attributes.
Relational Model Constraints and Relational Database Schemas
• Suppose that we denote one such subset of attributes by SK; then for
any two distinct tuples t1 and t2 in a relation state r of R, we have
the constraint that t1[SK] ≠ t2[SK].
• A superkey SK specifies a uniqueness constraint that no two distinct
tuples in any state r of R can have the same value for SK.
• A key K of a relation schema R is a superkey of R with the additional
property that removing any attribute A from K leaves a set of
attributes K′ that is not a superkey of R any more.
Relational Model Constraints and Relational Database Schemas
• Hence, a key satisfies two properties:
1. Two distinct tuples in any state of the relation cannot have
identical values for (all) the attributes in the key. This uniqueness
property also applies to a superkey.
2. It is a minimal superkey, that is, a superkey from which we cannot
remove any attributes and still have the uniqueness constraint hold.
This minimality property is required for a key but is optional for a
superkey.
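The two properties of a key can be checked mechanically against one relation state. The sketch below is only an illustration: `is_superkey` and `is_key` are hypothetical helper names, the rows are invented, and a real key is a property of the schema that must hold in every state, which a test on a single state cannot prove.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows of this state agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def is_key(rows, attrs):
    """True if attrs is a superkey and removing any one attribute breaks uniqueness."""
    if not is_superkey(rows, attrs):
        return False
    # Minimality: every subset with one attribute removed must fail.
    smaller = combinations(attrs, len(attrs) - 1)
    return not any(is_superkey(rows, subset) for subset in smaller)

# Invented relation state for illustration.
students = [
    {"Name": "Alice", "Ssn": "111111111", "Age": 19},
    {"Name": "Bob", "Ssn": "222222222", "Age": 19},
    {"Name": "Alice", "Ssn": "333333333", "Age": 20},
]
```

In this state, {Ssn} behaves as a key, while {Ssn, Name} is a superkey that is not minimal, because Name can be removed without losing uniqueness.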
Relational Model Constraints and Relational Database Schemas
Relational Databases and Relational Database Schemas
• A relational database is a collection of many relations.
• A relational database schema S is a set of relation schemas
S = {R1, R2, … , Rm} and a set of integrity constraints IC.
• A relational database state DB of S is a set of relation states
DB = {r1, r2, … , rm} such that each ri is a state of Ri and such that
the ri relation states satisfy the integrity constraints specified in
IC.
Relational Model Constraints and Relational Database Schemas
• An example is the relational database schema that we call
COMPANY = {EMPLOYEE, DEPARTMENT, DEPT_LOCATIONS, PROJECT, WORKS_ON,
DEPENDENT}.
• When we refer to a relational database, we implicitly include both
its schema and its current state.
• A database state that does not obey all the integrity constraints is
called not valid, and a state that satisfies all the constraints in
the defined set of integrity constraints IC is called a valid state.
Relational Model Constraints and Relational Database Schemas
Entity Integrity, Referential Integrity, and Foreign Keys
• The entity integrity constraint states that no primary key value can
be NULL. This is because the primary key value is used to identify
individual tuples in a relation.
• Key constraints and entity integrity constraints are specified on
individual relations.
• The referential integrity constraint is specified between two
relations and is used to maintain the consistency among tuples in the
two relations.
• Informally, the referential integrity constraint states that a tuple
in one relation that refers to another relation must refer to an
existing tuple in that relation.
Relational Model Constraints and Relational Database Schemas
• For example, the attribute Dno of EMPLOYEE gives the department
number for which each employee works; hence, its value in every
EMPLOYEE tuple must match the Dnumber value of some tuple in the
DEPARTMENT relation.
• The conditions for a foreign key, given below, specify a referential
integrity constraint between the two relation schemas R1 and R2.
• A set of attributes FK in relation schema R1 is a foreign key of R1
that references relation R2 if it satisfies the following rules:
1. The attributes in FK have the same domain(s) as the primary key
attributes PK of R2; the attributes FK are said to reference or refer
to the relation R2.
Relational Model Constraints and Relational Database Schemas
2. A value of FK in a tuple t1 of the current state r1(R1) either
occurs as a value of PK for some tuple t2 in the current state r2(R2)
or is NULL. In the former case, we have t1[FK] = t2[PK], and we say
that the tuple t1 references or refers to the tuple t2.
• In this definition, R1 is called the referencing relation and R2 is
the referenced relation. If these two conditions hold, a referential
integrity constraint from R1 to R2 is said to hold.
• In the EMPLOYEE relation, the attribute Dno refers to the department
for which an employee works; hence, we designate Dno to be a foreign
key of EMPLOYEE referencing the DEPARTMENT relation.
Relational Model Constraints and Relational Database Schemas
• This means that a value of Dno in any tuple t1 of the EMPLOYEE
relation must match a value of the primary key of DEPARTMENT (the
Dnumber attribute) in some tuple t2 of the DEPARTMENT relation, or the
value of Dno can be NULL if the employee does not belong to a
department or will be assigned to a department later.
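The second foreign key rule (each FK value is either NULL or matches an existing PK value) can be sketched as a check over two relation states. The helper name `dangling_references` and the sample rows are assumptions made for illustration, with `None` standing for NULL.

```python
def dangling_references(referencing, fk, referenced, pk):
    """Return the referencing rows whose non-NULL fk value matches no pk value."""
    pk_values = {row[pk] for row in referenced}
    return [
        row for row in referencing
        if row[fk] is not None and row[fk] not in pk_values
    ]

# Invented states of DEPARTMENT and EMPLOYEE.
DEPARTMENT = [
    {"Dnumber": 4, "Dname": "Administration"},
    {"Dnumber": 5, "Dname": "Research"},
]
EMPLOYEE = [
    {"Ssn": "111111111", "Dno": 5},     # refers to an existing department
    {"Ssn": "222222222", "Dno": None},  # NULL is allowed for a foreign key
    {"Ssn": "333333333", "Dno": 7},     # no department 7: violation
]

violations = dangling_references(EMPLOYEE, "Dno", DEPARTMENT, "Dnumber")
```

Only the third EMPLOYEE row is reported, since its Dno value is neither NULL nor the Dnumber of any DEPARTMENT tuple.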
Relational Model Constraints and Relational Database Schemas
Other types of constraints
• Examples: the salary of an employee should not exceed the salary of
the employee’s supervisor, and the maximum number of hours an employee
can work on all projects per week is 56.
• Such constraints can be specified and enforced within the application
programs that update the database, or by using a general-purpose
constraint specification language. They are sometimes called semantic
integrity constraints.
Update operations, Transactions, and Dealing with
Constraint Violations

• There are three basic operations that can
change the states of relations in the database:
• Insert, Delete, and Update (or Modify).
• Insert is used to insert one or more new
tuples in a relation.
• Delete is used to delete tuples.
• Update (or Modify) is used to change the
values of some attributes in existing tuples.
Update operations, Transactions, and Dealing with
Constraint Violations

The Insert Operation:


• The Insert operation provides a list of attribute values for a new
tuple t that is to be inserted into a relation R.
• Insert can violate any of the four types of constraints.
1. Domain constraints can be violated if an attribute value is given
that does not appear in the corresponding domain or is not of
the appropriate data type.
2. Key constraints can be violated if a key value in the new tuple t
already exists in another tuple in the relation r(R).
3. Entity integrity can be violated if any part of the primary key of
the new tuple t is NULL.
4. Referential integrity can be violated if the value of any foreign
key in t refers to a tuple that does not exist in the referenced
relation.
Update operations, Transactions, and Dealing with
Constraint Violations

• If an insertion violates one or more constraints, the
default option is to reject the insertion.
• Another option is to attempt to correct the reason
for rejecting the insertion, but this is typically not
used for violations caused by Insert; rather, it is
used more often in correcting violations for Delete
and Update.
Update operations, Transactions, and Dealing with
Constraint Violations

The Delete Operation


• The Delete operation can violate only referential
integrity.
• This occurs if the tuple being deleted is referenced by
foreign keys from other tuples in the database.
• To specify deletion, a condition on the attributes of the
relation selects the tuple (or tuples) to be deleted.
• Several options are available if a deletion operation
causes a violation. The first option, called restrict, is to
reject the deletion.
Update operations, Transactions, and Dealing with
Constraint Violations

• The second option, called cascade, is to attempt to
cascade (or propagate) the deletion by deleting
tuples that reference the tuple that is being deleted.
• For example, in operation 2, the DBMS could
automatically delete the offending tuples from
WORKS_ON with Essn = ‘999887777’.
• A third option, called set null or set default, is to
modify the referencing attribute values that cause the
violation; each such value is either set to NULL or
changed to reference another default valid tuple.
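The three options for a deletion that would break referential integrity (restrict, cascade, set null) can be sketched as follows. The function name `delete_with_option`, its parameters, and the sample rows are hypothetical; a real DBMS applies these options through the constraint declaration rather than through application code like this.

```python
def delete_with_option(referenced, pk, key_value, referencing, fk, option="restrict"):
    """Delete the referenced row with pk == key_value.

    Returns (new_referenced, new_referencing); the inputs are not modified.
    """
    is_referenced = any(row[fk] == key_value for row in referencing)
    if option == "restrict":
        if is_referenced:
            raise ValueError("deletion rejected: tuple is still referenced")
    elif option == "cascade":
        # Propagate the deletion to the tuples that reference the deleted tuple.
        referencing = [row for row in referencing if row[fk] != key_value]
    elif option == "set null":
        # Replace each offending foreign key value with NULL (None).
        referencing = [
            {**row, fk: None} if row[fk] == key_value else row
            for row in referencing
        ]
    else:
        raise ValueError(f"unknown option: {option}")
    referenced = [row for row in referenced if row[pk] != key_value]
    return referenced, referencing

# Invented data for illustration.
DEPARTMENT = [{"Dnumber": 4}, {"Dnumber": 5}]
EMPLOYEE = [{"Ssn": "111111111", "Dno": 5}, {"Ssn": "222222222", "Dno": 4}]

_, after_cascade = delete_with_option(DEPARTMENT, "Dnumber", 5, EMPLOYEE, "Dno", "cascade")
_, after_set_null = delete_with_option(DEPARTMENT, "Dnumber", 5, EMPLOYEE, "Dno", "set null")
```

Note that set null is only acceptable when the foreign key is not part of the referencing relation's primary key; otherwise setting it to NULL would violate entity integrity.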
Update operations, Transactions, and Dealing with
Constraint Violations

The Update Operation


• The Update (or Modify) operation is used to change the
values of one or more attributes in a tuple (or tuples) of
some relation R.
• It is necessary to specify a condition on the attributes of
the relation to select the tuple (or tuples) to be modified.
• Updating an attribute that is neither part of a primary
key nor part of a foreign key usually causes no problems;
the DBMS need only check to confirm that the new value
is of the correct data type and domain.
Update operations, Transactions, and Dealing with
Constraint Violations

• If a foreign key attribute is modified, the DBMS must
make sure that the new value refers to an existing
tuple in the referenced relation (or is set to NULL).
• When a referential integrity constraint is specified in
the DDL, the DBMS will allow the user to choose
separate options to deal with a violation caused by
Delete and a violation caused by Update.
Update operations, Transactions, and Dealing with
Constraint Violations

The Transaction Concept


• A transaction is an executing program that includes some
database operations, such as reading from the database, or
applying insertions, deletions, or updates to the database.
• At the end of the transaction, it must leave the database in a
valid or consistent state that satisfies all the constraints
specified on the database schema.
• A single transaction may involve any number of retrieval
operations and any number of update operations.
• These retrievals and updates will together form an atomic
unit of work against the database.
Update operations, Transactions, and Dealing with
Constraint Violations

• A large number of commercial applications running
against relational databases in online transaction
processing (OLTP) systems are executing
transactions at rates that reach several hundred per
second.
Module 2 – Chapter 2
Unary and Binary relational
operations
The SELECT Operation
• The SELECT operation is used to choose a subset of the tuples
from a relation that satisfies a ‘selection condition’.
• It restricts the tuples in a relation to only those tuples that
satisfy the condition.
• It can also be visualized as a horizontal partition of the relation
into two sets of tuples—those tuples that satisfy the condition
and are selected, and those tuples that do not satisfy the
condition and are discarded.
• For example, we can select the EMPLOYEE tuples whose department
is 4, or those whose salary is greater than $30,000, each with its
own SELECT operation.
Unary and Binary relational
operations
• In general, the SELECT operation is denoted by
σ<selection condition>(R),
where the symbol σ (sigma) is used to denote the SELECT
operator and the selection condition is a Boolean expression
(condition) specified on the attributes of relation R.
• The Boolean expression specified in <selection condition> is made
up of a number of clauses of the form
<attribute name> <comparison op> <constant value> or
<attribute name> <comparison op> <attribute name>.
• Clauses can be connected by the standard Boolean operators
and, or, and not to form a general selection condition.
Unary and Binary relational
operations
• σ<selection condition>(R)
• σDno=4(EMPLOYEE)
• σSalary>30000(EMPLOYEE)
• σ(Dno=4 AND Salary>25000) OR (Dno=5 AND Salary>30000)(EMPLOYEE)
• The Boolean conditions AND, OR, and NOT have their normal
interpretation, as follows:
• (cond1 AND cond2) is TRUE if both (cond1) and (cond2) are
TRUE; otherwise, it is FALSE.
• (cond1 OR cond2) is TRUE if either (cond1) or (cond2) or
both are TRUE; otherwise, it is FALSE.
• (NOT cond) is TRUE if cond is FALSE; otherwise, it is FALSE.
Unary and Binary relational
operations
• The SELECT operator is unary; that is, it is applied to a
single relation. Hence, selection conditions cannot
involve more than one tuple.
• The degree of the relation resulting from a SELECT
operation—its number of attributes—is the same as the
degree of R.
• The SELECT operation is commutative; that is,
σ (cond1) (σ(cond2)(R)) = σ(cond2)(σ(cond1)(R))
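The SELECT operation can be sketched in Python by treating a relation as a list of dictionaries and a selection condition as a Boolean function of one tuple. The `select` helper and the EMPLOYEE rows are made up for illustration and are not taken from the COMPANY database state.

```python
def select(relation, condition):
    """σ<condition>(relation): keep the tuples for which condition is true."""
    return [t for t in relation if condition(t)]

# Invented EMPLOYEE state.
EMPLOYEE = [
    {"Fname": "A", "Dno": 4, "Salary": 26000},
    {"Fname": "B", "Dno": 5, "Salary": 31000},
    {"Fname": "C", "Dno": 5, "Salary": 28000},
    {"Fname": "D", "Dno": 4, "Salary": 24000},
]

# σ Dno=4 (EMPLOYEE)
dept4 = select(EMPLOYEE, lambda t: t["Dno"] == 4)

# σ (Dno=4 AND Salary>25000) OR (Dno=5 AND Salary>30000) (EMPLOYEE)
mixed = select(
    EMPLOYEE,
    lambda t: (t["Dno"] == 4 and t["Salary"] > 25000)
    or (t["Dno"] == 5 and t["Salary"] > 30000),
)

# Commutativity: the order of two cascaded selections does not matter.
cond1 = lambda t: t["Dno"] == 5
cond2 = lambda t: t["Salary"] > 30000
commutes = select(select(EMPLOYEE, cond2), cond1) == select(select(EMPLOYEE, cond1), cond2)
```

The condition is evaluated on one tuple at a time, and every result tuple keeps all attributes, so the degree of the result equals the degree of EMPLOYEE.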
Unary and Binary relational
operations
The PROJECT Operation
• The PROJECT operation selects certain columns from the
table and discards the other columns.
• The result of the PROJECT operation can be visualized as a
vertical partition of the relation into two relations: one has the
needed columns (attributes) and contains the result of the
operation, and the other contains the discarded columns.
• For example, to list each employee’s first and last name and
salary, we can use the PROJECT operation as follows:
πLname, Fname, Salary(EMPLOYEE)
Unary and Binary relational
operations
• The general form of the PROJECT operation is π<attribute list>(R),
where π (pi) is the symbol used to represent the PROJECT
operation, and <attribute list> is the desired sublist of attributes
from the attributes of relation R.
• The result of the PROJECT operation has only the attributes
specified in <attribute list>, in the same order as they appear in
the list. Hence, its degree is equal to the number of attributes in
<attribute list>.
• The PROJECT operation removes any duplicate tuples, so the
result of the PROJECT operation is a set of distinct tuples, and
hence a valid relation. This is known as duplicate elimination.
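Duplicate elimination is the part of PROJECT that is easiest to overlook, so a small sketch may help. The `project` helper and the sample rows are illustrative assumptions; the relation is again a list of dictionaries.

```python
def project(relation, attributes):
    """π<attributes>(relation): keep the listed columns and drop duplicate tuples."""
    result, seen = [], set()
    for t in relation:
        values = tuple(t[a] for a in attributes)
        if values not in seen:  # duplicate elimination
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result

# Invented EMPLOYEE state with repeated (Sex, Salary) combinations.
EMPLOYEE = [
    {"Fname": "A", "Sex": "M", "Salary": 30000},
    {"Fname": "B", "Sex": "F", "Salary": 25000},
    {"Fname": "C", "Sex": "M", "Salary": 30000},
    {"Fname": "D", "Sex": "F", "Salary": 43000},
]

# π Sex, Salary (EMPLOYEE)
sex_salary = project(EMPLOYEE, ["Sex", "Salary"])
```

Four input tuples yield three result tuples, because two employees share the same (Sex, Salary) combination.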
Unary and Binary relational
operations
Sequences of Operations and the RENAME Operation
• The relations shown above that depict operation results do not
have any names.
• Either we can write the operations as a single relational algebra
expression by nesting the operations, or we can apply one operation
at a time and create intermediate result relations.
• In the latter case, we must give names to the relations that hold the
intermediate results.
• For example, to retrieve the first name, last name, and salary of all
employees who work in department number 5, apply a SELECT and a
PROJECT operation.
πFname, Lname, Salary(σDno=5(EMPLOYEE))
Unary and Binary relational
operations
• Alternatively, we can explicitly show the sequence of
operations, giving a name to each intermediate relation,
and using the assignment operation, denoted by ← (left
arrow), as follows:
DEP5_EMPS ← σDno=5(EMPLOYEE)
RESULT ← πFname, Lname, Salary(DEP5_EMPS)
• It is sometimes simpler to break down a complex sequence
of operations by specifying intermediate result relations
than to write a single relational algebra expression.
• We can also use this technique to rename the attributes in
the intermediate and result relations.
Unary and Binary relational
operations
• To rename the attributes in a relation, we simply list the
new attribute names in parentheses, as in the following
example:
TEMP ← σDno=5(EMPLOYEE)
R(First_name, Last_name, Salary) ← πFname, Lname, Salary(TEMP)
• We can also define a formal RENAME operation, which can rename
either the relation name or the attribute names, or both,
as a unary operator.
Unary and Binary relational
operations
• The general RENAME operation when applied to a
relation R of degree n is denoted by any of the following
three forms: ρS(B1, B2, ... , Bn)(R) or ρS(R) or ρ(B1,
B2, ... , Bn)(R),
• where the symbol ρ (rho) is used to denote the RENAME
operator, S is the new relation name, and B1, B2, … , Bn
are the new attribute names.
• The first expression renames both the relation and its
attributes, the second renames the relation only, and
the third renames the attributes only.
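The three forms of RENAME can be sketched with one function that takes an optional new relation name and an optional list of new attribute names. The representation of a relation as a `(name, attributes, rows)` triple and the `rename` helper are assumptions made for this illustration.

```python
def rename(relation, new_name=None, new_attrs=None):
    """ρ applied to a relation given as (name, attributes, rows).

    rename(R, "S", attrs) ~ ρS(B1, ..., Bn)(R)
    rename(R, "S")        ~ ρS(R)
    rename(R, None, attrs) ~ ρ(B1, ..., Bn)(R)
    """
    name, attrs, rows = relation
    if new_attrs is not None and len(new_attrs) != len(attrs):
        raise ValueError("need exactly one new name per attribute")
    attrs_out = tuple(new_attrs) if new_attrs is not None else attrs
    return (new_name or name, attrs_out, rows)

# Invented intermediate relation TEMP.
TEMP = ("TEMP", ("Fname", "Lname", "Salary"), {("A", "B", 30000)})

both = rename(TEMP, "R", ("First_name", "Last_name", "Salary"))
relation_only = rename(TEMP, "S")
attributes_only = rename(TEMP, new_attrs=("First_name", "Last_name", "Salary"))
```

Renaming changes only the schema; the set of tuples is passed through untouched.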
Unary and Binary relational
operations
Relational Algebra Operations from Set Theory (The UNION,
INTERSECTION, and MINUS Operations)
• UNION: The result of this operation, denoted by R ∪ S, is a relation
that includes all tuples that are either in R or in S or in both R and
S. Duplicate tuples are eliminated.
• INTERSECTION: The result of this operation, denoted by R ∩ S, is a
relation that includes all tuples that are in both R and S.
• SET DIFFERENCE (or MINUS): The result of this operation, denoted
by R – S, is a relation that includes all tuples that are in R but not
in S.
• These are binary operations; that is, each is applied to two sets (of
tuples).
Unary and Binary relational
operations
• When these operations are adapted to relational databases,
the two relations on which any of these three operations are
applied must have the same type of tuples; this condition
has been called union compatibility or type compatibility.
• Two relations R(A1, A2, … , An) and S(B1, B2, … , Bn) are
said to be union compatible (or type compatible) if they
have the same degree n and if dom(Ai) = dom(Bi) for 1 ≤ i
≤ n.
• This means that the two relations have the same number of
attributes and each corresponding pair of attributes has the
same domain.
Unary and Binary relational
operations
• For example, to retrieve the Social Security numbers of all
employees who either work in department 5 or directly
supervise an employee who works in department 5, we can use
the UNION operation as follows:
• DEP5_EMPS ← σDno=5(EMPLOYEE)
• RESULT1 ← πSsn(DEP5_EMPS)
• RESULT2(Ssn) ← πSuper_ssn(DEP5_EMPS)
• RESULT ← RESULT1 ∪ RESULT2
• Both UNION and INTERSECTION are commutative operations;
that is,
R ∪ S = S ∪ R and R ∩ S = S ∩ R
Unary and Binary relational
operations
• Both UNION and INTERSECTION can be treated as n-ary
operations applicable to any number of relations
because both are also associative operations; that is,
R ∪ (S ∪ T ) = (R ∪ S) ∪ T and (R ∩ S) ∩ T = R ∩ (S ∩ T)
• The MINUS operation is not commutative; that is, in
general,
R−S≠S−R
• The INTERSECTION can be expressed in terms of union
and set difference as follows: R ∩ S = ((R ∪ S) − (R −
S)) − (S − R)
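Python's built-in set type gives a direct way to try out UNION, INTERSECTION, and MINUS on union-compatible relations and to check the identities stated above. The two relations below are invented, with each tuple holding a first name and a last name.

```python
# Two union-compatible relations: same degree, same domains per position.
R = {("Susan", "Yao"), ("Ramesh", "Shah"), ("Johnny", "Kohler")}
S = {("John", "Smith"), ("Susan", "Yao"), ("Ramesh", "Shah")}

union = R | S          # R ∪ S, duplicates are eliminated automatically
intersection = R & S   # R ∩ S
r_minus_s = R - S      # R − S
s_minus_r = S - R      # S − R

# UNION and INTERSECTION are commutative; MINUS in general is not.
union_commutes = (R | S) == (S | R)
minus_commutes = r_minus_s == s_minus_r

# R ∩ S = ((R ∪ S) − (R − S)) − (S − R)
identity_holds = intersection == ((R | S) - (R - S)) - (S - R)
```

Removing R − S from the union leaves exactly S, and removing S − R from that leaves the tuples common to both relations, which is why the identity holds.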
Unary and Binary relational
operations
The CARTESIAN PRODUCT (CROSS PRODUCT) Operation
• Next, we discuss the CARTESIAN PRODUCT operation, also known as
CROSS PRODUCT or CROSS JOIN, which is denoted by ×.
• This is also a binary set operation, but the relations on
which it is applied do not have to be union compatible.
• This set operation produces a new element by combining
every member (tuple) from one relation (set) with every
member (tuple) from the other relation (set).
• In general, the result of R(A1, A2, ..., An) × S(B1, B2, ..., Bm)
is a relation Q with degree n + m attributes Q(A1, A2, ..., An,
B1, B2, ..., Bm), in that order.
Unary and Binary relational
operations
• The resulting relation Q has one tuple for each combination of
tuples—one from R and one from S.
• If R has nR tuples and S has nS tuples, then R × S will have
nR * nS tuples.
• For example, suppose that we want to retrieve a list of names of each
female employee’s dependents. We can do this as follows:
FEMALE_EMPS ← σSex=‘F’(EMPLOYEE)
EMPNAMES ← πFname, Lname, Ssn(FEMALE_EMPS)
EMP_DEPENDENTS ← EMPNAMES × DEPENDENT
ACTUAL_DEPENDENTS ← σSsn=Essn(EMP_DEPENDENTS)
RESULT ← πFname, Lname, Dependent_name(ACTUAL_DEPENDENTS)
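The product-then-select pattern in this example can be traced step by step in Python. The `cartesian_product` helper assumes the two relations have no attribute names in common, and the EMPNAMES and DEPENDENT rows are invented for illustration.

```python
def cartesian_product(r, s):
    """R × S for lists of dicts, assuming disjoint attribute names."""
    return [{**t_r, **t_s} for t_r in r for t_s in s]

# Invented intermediate relation EMPNAMES and relation DEPENDENT.
EMPNAMES = [
    {"Fname": "Ann", "Lname": "Lee", "Ssn": "111111111"},
    {"Fname": "Eve", "Lname": "Ray", "Ssn": "222222222"},
]
DEPENDENT = [
    {"Essn": "222222222", "Dependent_name": "Sam"},
    {"Essn": "333333333", "Dependent_name": "Kim"},
    {"Essn": "333333333", "Dependent_name": "Max"},
]

# EMP_DEPENDENTS ← EMPNAMES × DEPENDENT (every combination of tuples)
EMP_DEPENDENTS = cartesian_product(EMPNAMES, DEPENDENT)

# ACTUAL_DEPENDENTS ← σ Ssn=Essn (EMP_DEPENDENTS)
ACTUAL_DEPENDENTS = [t for t in EMP_DEPENDENTS if t["Ssn"] == t["Essn"]]

# RESULT ← π Fname, Lname, Dependent_name (ACTUAL_DEPENDENTS)
RESULT = [
    {a: t[a] for a in ("Fname", "Lname", "Dependent_name")}
    for t in ACTUAL_DEPENDENTS
]
```

The product has 2 × 3 = 6 tuples, most of them meaningless pairings; the selection on Ssn = Essn keeps only the combinations in which the dependent really belongs to the employee.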
Unary and Binary relational
operations
Binary Relational Operations: JOIN and DIVISION (The
JOIN Operation)
• The JOIN operation, denoted by ⨝, is used to combine related
tuples from two relations into single “longer” tuples.
• This operation is very important for any relational database
with more than a single relation because it allows us to
process relationships among relations.
• To illustrate JOIN, suppose that we want to retrieve the name
of the manager of each department, as follows:
DEPT_MGR ← DEPARTMENT ⨝ Mgr_ssn=Ssn EMPLOYEE
RESULT ← πDname, Lname, Fname(DEPT_MGR)
Unary and Binary relational
operations
• The general form of a JOIN operation on two relations R(A1, A2,
… , An) and S(B1, B2, … , Bm) is R ⨝ <join condition>S
• The result of the JOIN is a relation Q with n + m attributes
Q(A1, A2, … , An, B1, B2, … , Bm) in that order; Q has one
tuple for each combination of tuples—one from R and one from
S—whenever the combination satisfies the join condition.
• The main difference between CARTESIAN PRODUCT and JOIN is that in
JOIN, only combinations of tuples satisfying the join condition
appear in the result, whereas in the CARTESIAN PRODUCT all
combinations of tuples are included in the result.
Unary and Binary relational
operations
• The join condition is specified on attributes from the two relations
R and S and is evaluated for each combination of tuples.
• Each tuple combination for which the join condition evaluates to
TRUE is included in the resulting relation Q as a single combined
tuple.
• A general join condition is of the form
<condition> AND <condition> AND … AND <condition>, where
each <condition> is of the form Ai θ Bj, Ai is an attribute of R, Bj
is an attribute of S, Ai and Bj have the same domain, and θ
(theta) is one of the comparison operators {=, <, ≤, >, ≥, ≠}.
• A JOIN operation with such a general join condition is called a
THETA JOIN.
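A THETA JOIN can be sketched as a nested loop that keeps a combined tuple only when the join condition holds. The `theta_join` helper assumes disjoint attribute names in the two relations, and the DEPARTMENT and EMPLOYEE rows are invented for illustration.

```python
def theta_join(r, s, condition):
    """R ⨝<condition> S: combine each pair of tuples that satisfies condition."""
    return [{**t_r, **t_s} for t_r in r for t_s in s if condition(t_r, t_s)]

# Invented relation states.
DEPARTMENT = [
    {"Dname": "Research", "Mgr_ssn": "222222222"},
    {"Dname": "Administration", "Mgr_ssn": "333333333"},
]
EMPLOYEE = [
    {"Ssn": "111111111", "Lname": "Lee"},
    {"Ssn": "222222222", "Lname": "Ray"},
    {"Ssn": "333333333", "Lname": "Fox"},
]

# DEPT_MGR ← DEPARTMENT ⨝ Mgr_ssn=Ssn EMPLOYEE
DEPT_MGR = theta_join(DEPARTMENT, EMPLOYEE, lambda d, e: d["Mgr_ssn"] == e["Ssn"])

# The same result written as a selection over the Cartesian product.
product = [{**d, **e} for d in DEPARTMENT for e in EMPLOYEE]
via_product = [t for t in product if t["Mgr_ssn"] == t["Ssn"]]
```

Both routes give the same relation; the join simply avoids materializing the combinations that the selection would discard. Since the only comparison used here is =, this particular join is also an EQUIJOIN.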
Unary and Binary relational
operations
Variations of JOIN:
• The EQUIJOIN and NATURAL JOIN
• The most common use of JOIN involves join conditions with equality
comparisons only. Such a JOIN, where the only comparison operator used is
=, is called an EQUIJOIN.
• Notice that in the result of an EQUIJOIN we always have one or more pairs
of attributes that have identical values in every tuple.
• For example, the values of the attributes Mgr_ssn and Ssn are identical in
every tuple of DEPT_MGR (the EQUIJOIN result) because the equality join
condition specified on these two attributes requires the values to be
identical in every tuple in the result.
• Because one of each pair of attributes with identical values is superfluous,
a new operation called NATURAL JOIN—denoted by * was created to get rid
of the second (superfluous) attribute in an EQUIJOIN condition.
Unary and Binary relational
operations
• The standard definition of NATURAL JOIN requires that the two join
attributes (or each pair of join attributes) have the same name in both
relations. If this is not the case, a renaming operation is applied first.
• Suppose we want to combine each PROJECT tuple with the DEPARTMENT
tuple that controls the project. In the following example, first we rename
the Dnumber attribute of DEPARTMENT to Dnum— so that it has the same
name as the Dnum attribute in PROJECT—and then we apply NATURAL
JOIN:
PROJ_DEPT ← PROJECT * ρ(Dname, Dnum, Mgr_ssn, Mgr_start_date)
(DEPARTMENT)
• The same query can be done in two steps by creating an intermediate
table DEPT as follows:
DEPT ← ρ(Dname, Dnum, Mgr_ssn, Mgr_start_date)(DEPARTMENT)
PROJ_DEPT ← PROJECT * DEPT
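The rename-then-join steps for PROJ_DEPT can be sketched as follows. The helpers `rename_attr` and `natural_join` are hypothetical names, the relations are lists of dictionaries, and the rows are invented, with fewer attributes than the actual PROJECT and DEPARTMENT relations.

```python
def rename_attr(relation, old, new):
    """Rename one attribute in every tuple of the relation."""
    return [{(new if k == old else k): v for k, v in t.items()} for t in relation]

def natural_join(r, s):
    """R * S: equate all attributes with the same name and keep one copy of each."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]  # the join attributes
    return [
        {**t_r, **t_s}
        for t_r in r
        for t_s in s
        if all(t_r[a] == t_s[a] for a in common)
    ]

# Invented relation states.
PROJECT = [
    {"Pname": "ProductX", "Pnumber": 1, "Dnum": 5},
    {"Pname": "Computerization", "Pnumber": 10, "Dnum": 4},
]
DEPARTMENT = [
    {"Dname": "Research", "Dnumber": 5},
    {"Dname": "Administration", "Dnumber": 4},
]

# DEPT ← ρ(Dname, Dnum, ...)(DEPARTMENT), then PROJ_DEPT ← PROJECT * DEPT
DEPT = rename_attr(DEPARTMENT, "Dnumber", "Dnum")
PROJ_DEPT = natural_join(PROJECT, DEPT)
```

Each result tuple carries Dnum only once, which is the difference from the corresponding EQUIJOIN, where both Dnum and Dnumber would appear with identical values.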
Unary and Binary relational
operations
• The attribute Dnum is called the join attribute for the NATURAL
JOIN operation, because it is the only attribute with the same
name in both relations.
• In general, the join condition for NATURAL JOIN is constructed by
equating each pair of join attributes that have the same name
in the two relations and combining these conditions with AND.
• A single JOIN operation is used to combine data from two
relations so that related information can be presented in a
single table. These operations are also known as inner joins.
• A more general, but nonstandard, definition for NATURAL JOIN is
Q ← R *(<list1>),(<list2>) S
Unary and Binary relational
operations
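• In this case, <list1> specifies a list of i attributes from R,
and <list2> specifies a list of i attributes from S. The lists
are used to form equality comparison conditions between
pairs of corresponding attributes, and the conditions are
then ANDed together.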
• The NATURAL JOIN or EQUIJOIN operation can also be
specified among multiple tables, leading to an n-way
join.
• For example, consider the following three-way join:
((PROJECT ⨝ Dnum=Dnumber DEPARTMENT) ⨝ Mgr_ssn=Ssn EMPLOYEE)
Unary and Binary relational
operations
A Complete Set of Relational Algebra Operations
• It has been shown that the set of relational algebra operations {σ, π,
∪, ρ, –, ×} is a complete set; that is, any of the other original
relational algebra operations can be expressed as a sequence of
operations from this set.
• For example, the INTERSECTION operation can be expressed by using
UNION and MINUS as follows: R ∩ S ≡ (R ∪ S) – ((R – S) ∪ (S – R))
• JOIN operation can be specified as a CARTESIAN PRODUCT followed
by a SELECT operation: R ⨝<condition> S ≡ σ<condition>(R × S)
• A NATURAL JOIN can be specified as a CARTESIAN PRODUCT preceded
by RENAME and followed by SELECT and PROJECT operations.
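The two equivalences above can be checked on small examples with Python sets. This is an illustrative sketch with made-up tuples, not a proof.

```python
from itertools import product

# Two union-compatible relations as sets of tuples (made-up values).
r = {("a", 1), ("b", 2), ("c", 3)}
s = {("b", 2), ("c", 3), ("d", 4)}

# R ∩ S expressed with UNION (|) and MINUS (-) only.
intersection = (r | s) - ((r - s) | (s - r))

# JOIN expressed as a CARTESIAN PRODUCT followed by a SELECT.
emp = {("Smith", 5), ("Wong", 4)}                 # (Lname, Dno)
dept = {(5, "Research"), (4, "Administration")}   # (Dnumber, Dname)
cartesian = {e + d for e, d in product(emp, dept)}
join = {t for t in cartesian if t[1] == t[2]}     # σ Dno=Dnumber

print(intersection == (r & s))
```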
Unary and Binary relational
operations
The DIVISION Operation
• The DIVISION operation, denoted by ÷, is useful for a special
kind of query that sometimes occurs in database applications.
• Example: Retrieve the names of employees who work on all
the projects that ‘John Smith’ works on.
• To express this query using the DIVISION operation, proceed
as follows. First, retrieve the list of project numbers that ‘John
Smith’ works on in the intermediate relation SMITH_PNOS:
• SMITH ← σFname=‘John’ AND Lname=‘Smith’(EMPLOYEE)
• SMITH_PNOS ← πPno(WORKS_ON ⨝ Essn=Ssn SMITH)
Unary and Binary relational
operations
• Next, create the intermediate relation SSN_PNOS, which
includes a tuple <Pno, Essn> whenever the employee whose
Ssn is Essn works on the project whose number is Pno:
• SSN_PNOS ← πEssn, Pno(WORKS_ON)
• Finally, apply the DIVISION operation to the two relations,
which gives the desired employees’ Social Security
numbers:
• SSNS(Ssn) ← SSN_PNOS ÷ SMITH_PNOS
• RESULT ← πFname, Lname(SSNS * EMPLOYEE)
Unary and Binary relational
operations
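• In general, the DIVISION operation is applied to two
relations
R(Z) ÷ S(X), where the attributes of S are a subset of
the attributes of R; that is, X ⊆ Z. Let Y be the set of
attributes of R that are not attributes of S; that is, Y = Z – X.
• The result is a relation T(Y) that includes a tuple t if, for
every tuple tS in S, R contains a tuple tR with tR[Y] = t
and tR[X] = tS.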
• The DIVISION operation is defined for convenience for
dealing with queries that involve universal
quantification or the all condition.
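A minimal Python sketch of R(Z) ÷ S(X) may help make the "for all" condition concrete. The function name divide and its parameters are hypothetical, and the sample tuples are made up.

```python
def divide(r, s, y_attrs, x_attrs):
    """Return T(Y): the Y-values of r that appear in r combined with
    every X-value found in s."""
    required = {tuple(t[a] for a in x_attrs) for t in s}
    seen = {}
    for t in r:
        y = tuple(t[a] for a in y_attrs)
        seen.setdefault(y, set()).add(tuple(t[a] for a in x_attrs))
    # Keep y only if its set of X-values covers all of s.
    return [dict(zip(y_attrs, y)) for y, xs in seen.items() if required <= xs]

# Made-up sample tuples, for illustration only.
ssn_pnos = [
    {"Essn": "123", "Pno": 1}, {"Essn": "123", "Pno": 2},
    {"Essn": "453", "Pno": 1}, {"Essn": "453", "Pno": 2},
    {"Essn": "453", "Pno": 3},
    {"Essn": "333", "Pno": 2}, {"Essn": "333", "Pno": 3},
]
smith_pnos = [{"Pno": 1}, {"Pno": 2}]

ssns = divide(ssn_pnos, smith_pnos, ["Essn"], ["Pno"])
```

Employee 333 is excluded because project 1 is missing, while 453 qualifies even though it also works on an extra project.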
Unary and Binary relational
operations
Notation for Query Trees
• A query tree is a tree data structure that corresponds to a
relational algebra expression.
• It represents the input relations of the query as leaf nodes of
the tree and represents the relational algebra operations as
internal nodes.
• It is also known as a query evaluation tree or query execution
tree.
• It includes the relational algebra operations being executed
and is used as a possible data structure for the internal
representation of the query in an RDBMS.
• An execution of the query tree consists of executing an internal
node operation whenever its operands (represented by its child
nodes) are available, and then replacing that internal node by the
relation that results from executing the operation.
• The execution terminates when the root node is executed and
produces the result relation for the query.
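The bottom-up execution of a query tree can be sketched with a few small Python classes. The class names (Relation, Select, Project) and the sample tuples are hypothetical; a real RDBMS uses far richer internal structures.

```python
class Relation:
    """Leaf node: an input relation of the query."""
    def __init__(self, tuples):
        self.tuples = tuples

    def execute(self):
        return self.tuples


class Select:
    """Internal node for SELECT; it has one child node."""
    def __init__(self, predicate, child):
        self.predicate, self.child = predicate, child

    def execute(self):
        operand = self.child.execute()  # the operand must be available first
        return [t for t in operand if self.predicate(t)]


class Project:
    """Internal node for PROJECT; it has one child node."""
    def __init__(self, attrs, child):
        self.attrs, self.child = attrs, child

    def execute(self):
        result = []
        for t in self.child.execute():
            row = {a: t[a] for a in self.attrs}
            if row not in result:  # eliminate duplicate tuples
                result.append(row)
        return result


# Made-up sample tuples, for illustration only.
employee = [
    {"Lname": "Smith", "Dno": 5},
    {"Lname": "Wong", "Dno": 5},
    {"Lname": "Zelaya", "Dno": 4},
]

# Tree for πLname(σDno=5(EMPLOYEE)); the root is the PROJECT node.
tree = Project(["Lname"], Select(lambda t: t["Dno"] == 5, Relation(employee)))
result = tree.execute()
```

Calling execute on the root recursively executes the child nodes first, so the root runs last and produces the result relation.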
Additional relational operations
(aggregate, grouping, etc.)
Generalized Projection
• The generalized projection operation extends the
projection operation by allowing functions of attributes to
be included in the projection list.
• The generalized form can be expressed as: πF1, F2, ..., Fn (R),
where F1, F2, … , Fn are functions over the attributes in
relation R and may involve arithmetic operations and
constant values.
• Helpful when developing reports where computed values
have to be produced in the columns of a query result.
Additional relational operations
(aggregate, grouping, etc.)
• Consider the relation EMPLOYEE (Ssn, Salary, Deduction,
Years_service)
• A report may be required to show
Net Salary = Salary – Deduction,
Bonus = 2000 * Years_service, and
Tax = 0.25 * Salary
• Then a generalized projection combined with renaming
may be used as follows:
• REPORT ← ρ(Ssn, Net_salary, Bonus, Tax)(πSsn, Salary – Deduction, 2000 * Years_service, 0.25 * Salary(EMPLOYEE))
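The REPORT expression above corresponds to a simple per-tuple computation. A Python sketch, with made-up sample tuples:

```python
# Made-up sample tuples, for illustration only.
employee = [
    {"Ssn": "123", "Salary": 30000, "Deduction": 2000, "Years_service": 5},
    {"Ssn": "333", "Salary": 40000, "Deduction": 3000, "Years_service": 10},
]

# Generalized projection combined with renaming of the computed columns.
report = [
    {
        "Ssn": t["Ssn"],
        "Net_salary": t["Salary"] - t["Deduction"],
        "Bonus": 2000 * t["Years_service"],
        "Tax": 0.25 * t["Salary"],
    }
    for t in employee
]
```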
Additional relational operations
(aggregate, grouping, etc.)
Aggregate Functions and Grouping
• Another type of request that cannot be expressed in the
basic relational algebra is to specify mathematical aggregate
functions on collections of values from the database.
• Examples of such functions include retrieving the average or
total salary of all employees or the total number of employee
tuples.
• Common functions applied to collections of numeric values
include SUM, AVERAGE, MAXIMUM, and MINIMUM.
• The COUNT function is used for counting tuples or values.
Additional relational operations
(aggregate, grouping, etc.)
• Another common type of request involves grouping the tuples
in a relation by the value of some of their attributes and then
applying an aggregate function independently to each group.
• We can define an AGGREGATE FUNCTION operation, using
the symbol ℑ (pronounced script F), to specify these types of
requests as follows: <grouping attributes> ℑ <function list> (R)
• where <grouping attributes> is a list of attributes of the
relation specified in R, and <function list> is a list of
(<function> <attribute>) pairs. In each such pair,
<function> is one of the allowed functions, such as SUM,
AVERAGE, MAXIMUM, MINIMUM, or COUNT, and <attribute>
is an attribute of the relation specified by R.
Additional relational operations
(aggregate, grouping, etc.)
• To retrieve each department number, the number of
employees in the department, and their average salary,
while renaming the resulting attributes, we write:
• ρR(Dno, No_of_employees, Average_sal)(Dno ℑ COUNT Ssn, AVERAGE Salary(EMPLOYEE))
• If no renaming is applied, then the attributes of the
resulting relation that correspond to the function list will
each be the concatenation of the function name with the
attribute name in the form <function>_<attribute>.
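Grouping followed by aggregation can be sketched in Python as below. The sample tuples are made up, and COUNT Ssn is computed as the group size on the assumption that Ssn is never NULL.

```python
from collections import defaultdict

# Made-up sample tuples, for illustration only.
employee = [
    {"Ssn": "123", "Dno": 5, "Salary": 30000},
    {"Ssn": "333", "Dno": 5, "Salary": 40000},
    {"Ssn": "999", "Dno": 4, "Salary": 25000},
]

# Step 1: partition the tuples by the grouping attribute Dno.
groups = defaultdict(list)
for t in employee:
    groups[t["Dno"]].append(t)

# Step 2: apply COUNT and AVERAGE independently to each group.
result = [
    {
        "Dno": dno,
        "No_of_employees": len(rows),
        "Average_sal": sum(t["Salary"] for t in rows) / len(rows),
    }
    for dno, rows in groups.items()
]
```

With no grouping attributes, all tuples would fall into one group and the result would contain a single tuple.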
Additional relational operations
(aggregate, grouping, etc.)
• If no grouping attributes are specified, the functions are
applied to all the tuples in the relation, so the resulting
relation has a single tuple only.
Recursive Closure Operations
• Another type of operation that, in general, cannot be specified
in the basic original relational algebra is recursive closure.
• This operation is applied to a recursive relationship between
tuples of the same type, such as the relationship between an
employee and a supervisor.
• This relationship is described by the foreign key Super_ssn of the
EMPLOYEE relation.
Additional relational operations
(aggregate, grouping, etc.)
• An example of a recursive operation is to retrieve all
supervisees of an employee e at all levels, that is, all
employees e′ directly supervised by e, all employees e″
directly supervised by each employee e′, all employees e‴
directly supervised by each employee e″, and so on.
• For example, to specify the Ssns of all employees e′ directly
supervised (at level one) by the employee e whose name is
‘James Borg’, we can write:
BORG_SSN ← πSsn(σFname=‘James’ AND Lname=‘Borg’(EMPLOYEE))
SUPERVISION(Ssn1, Ssn2) ← πSsn, Super_ssn(EMPLOYEE)
RESULT1(Ssn) ← πSsn1(SUPERVISION ⨝ Ssn2=Ssn BORG_SSN)
Additional relational operations
(aggregate, grouping, etc.)
• To retrieve all employees supervised by Borg at level 2
—that is, all employees e″ supervised by some
employee e′ who is directly supervised by Borg—we can
apply another JOIN to the result of the first query, as
follows:
• RESULT2(Ssn) ← πSsn1(SUPERVISION ⨝ Ssn2=Ssn RESULT1)
• To get both sets of employees supervised at levels 1
and 2 by ‘James Borg’, we can apply the UNION
operation to the two results, as follows:
RESULT ← RESULT2 ∪ RESULT1
Additional relational operations
(aggregate, grouping, etc.)
• We cannot specify a query such as “retrieve the
supervisees of ‘James Borg’ at all levels” without
utilizing a looping mechanism unless we know the
maximum number of levels.
• An operation called the transitive closure of relations
has been proposed to compute the recursive
relationship as far as the recursion proceeds.
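The looping mechanism mentioned above can be sketched in Python: the level-by-level JOIN is repeated until no new supervisees appear. The function name all_supervisees and the sample Ssn values are made up for illustration.

```python
def all_supervisees(supervision, boss_ssn):
    """supervision is a set of (Ssn1, Ssn2) pairs meaning that Ssn1 is
    directly supervised by Ssn2. Returns supervisees at all levels."""
    result = set()
    frontier = {boss_ssn}
    while frontier:
        # One iteration corresponds to one JOIN with SUPERVISION.
        next_level = {e for (e, sup) in supervision if sup in frontier}
        next_level -= result        # keep only newly found employees
        result |= next_level        # UNION with the earlier levels
        frontier = next_level
    return result

# Made-up supervision pairs, for illustration only.
supervision = {
    ("333", "888"), ("987", "888"),   # level 1 under 888
    ("123", "333"), ("453", "333"),   # level 2
    ("999", "987"),                   # level 2
    ("777", "999"),                   # level 3
}

everyone = all_supervisees(supervision, "888")
```

The loop stops as soon as a level adds nothing new, so the maximum number of levels does not need to be known in advance.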
Additional relational operations
(aggregate, grouping, etc.)
OUTER JOIN Operations
• For a NATURAL JOIN operation R * S, only tuples from R that have
matching tuples in S—and vice versa—appear in the result. Hence, tuples
without a matching (or related) tuple are eliminated from the JOIN result.
• Tuples with NULL values in the join attributes are also eliminated. This
type of join, where tuples with no match are eliminated, is known as an
inner join.
• This amounts to the loss of information if the user wants the result of the
JOIN to include all the tuples in one or more of the component relations.
• A set of operations, called outer joins, was developed for the case where
the user wants to keep all the tuples in R, or all those in S, or all those in
both relations in the result of the JOIN, regardless of whether or not they
have matching tuples in the other relation.
Additional relational operations
(aggregate, grouping, etc.)
• This satisfies the need of queries in which tuples from
two tables are to be combined by matching
corresponding rows, but without losing any tuples for
lack of matching values.
• Suppose that we want a list of all employee names as
well as the name of the departments they manage if
they happen to manage a department; if they do not
manage one, we can indicate it with a NULL value.
Additional relational
operations (aggregate,
grouping, etc.)

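Example of Left Outer Join
• The LEFT OUTER JOIN operation, denoted by ⟕, keeps every
tuple in the first, or left, relation R in R ⟕ S; if no matching
tuple is found in S, the attributes of S in the join result are
padded with NULL values.
TEMP ← (EMPLOYEE ⟕ Ssn=Mgr_ssn DEPARTMENT)
RESULT ← πFname, Minit, Lname, Dname(TEMP)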
Additional relational operations
(aggregate, grouping, etc.)
• A similar operation, RIGHT OUTER JOIN, denoted by ⟖,
keeps every tuple in the second, or right, relation S in
the result of R ⟖ S.
• A third operation, FULL OUTER JOIN, denoted by ⟗, keeps
all tuples in both the left and the right relations when no
matching tuples are found, padding them with NULL
values as needed.
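A left outer join can be sketched in Python by padding unmatched left tuples with None, which stands in for NULL here. The helper name left_outer_join and the sample tuples are made up for illustration.

```python
def left_outer_join(r, s, condition, s_attrs):
    """Keep every tuple of r; pad with None when s has no match."""
    result = []
    for t1 in r:
        matches = [t2 for t2 in s if condition(t1, t2)]
        if matches:
            result.extend({**t1, **t2} for t2 in matches)
        else:
            result.append({**t1, **dict.fromkeys(s_attrs)})
    return result

# Made-up sample tuples, for illustration only.
employee = [
    {"Lname": "Smith", "Ssn": "123"},
    {"Lname": "Wong", "Ssn": "333"},
    {"Lname": "Borg", "Ssn": "888"},
]
department = [
    {"Dname": "Research", "Mgr_ssn": "333"},
    {"Dname": "Headquarters", "Mgr_ssn": "888"},
]

temp = left_outer_join(employee, department,
                       lambda e, d: e["Ssn"] == d["Mgr_ssn"],
                       ["Dname", "Mgr_ssn"])
result = [(t["Lname"], t["Dname"]) for t in temp]
```

A right outer join can be obtained from the same helper by swapping the two arguments (and the argument order inside the condition).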
Additional relational operations
(aggregate, grouping, etc.)
The OUTER UNION Operation
• The OUTER UNION operation was developed to take the union of
tuples from two relations that have some common attributes but
are not union (type) compatible.
• This operation will take the UNION of tuples in two relations R(X, Y)
and S(X, Z) that are partially compatible, meaning that only some
of their attributes, say X, are union compatible.
• The attributes that are union compatible are represented only once
in the result, and those attributes that are not union compatible
from either relation are also kept in the result relation T(X, Y, Z).
• It is therefore the same as a FULL OUTER JOIN on the common
attributes.
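A minimal Python sketch of OUTER UNION on R(X, Y) and S(X, Z) follows. It assumes, as a simplification, that each combination of X values occurs at most once in each relation; the function name, attribute names, and sample tuples are all made up.

```python
def outer_union(r, s, x_attrs, y_attrs, z_attrs):
    """Return T(X, Y, Z); tuples of r and s with equal X values are
    merged, and missing attributes are left as None (NULL)."""
    all_attrs = x_attrs + y_attrs + z_attrs
    merged = {}
    for t in r + s:
        key = tuple(t[a] for a in x_attrs)
        row = merged.setdefault(key, dict.fromkeys(all_attrs))
        row.update(t)
    return list(merged.values())

# Made-up sample tuples, for illustration only.
student = [
    {"Name": "Ana", "Ssn": "111", "Advisor": "Lee"},
    {"Name": "Raj", "Ssn": "222", "Advisor": "Kim"},
]
instructor = [
    {"Name": "Raj", "Ssn": "222", "Rank": "Lecturer"},
    {"Name": "Mia", "Ssn": "333", "Rank": "Professor"},
]

t = outer_union(student, instructor, ["Name", "Ssn"], ["Advisor"], ["Rank"])
```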
Examples of
Queries in
relational
algebra
Examples of Queries in relational
algebra
Retrieve the name and address of all employees who work
for the ‘Research’ department.

RESEARCH_DEPT ← σDname=‘Research’(DEPARTMENT)
RESEARCH_EMPS ← (RESEARCH_DEPT ⨝ Dnumber=Dno EMPLOYEE)
RESULT ← πFname, Lname, Address(RESEARCH_EMPS)

As a single in-line expression, this query becomes:
πFname, Lname, Address(σDname=‘Research’(DEPARTMENT ⨝ Dnumber=Dno EMPLOYEE))
Examples of Queries in relational
algebra
Find the names of employees who work on all the projects
controlled by department number 5.

DEPT5_PROJS ← ρ(Pno)(πPnumber(σDnum=5(PROJECT)))

EMP_PROJ ← ρ(Ssn, Pno)(πEssn, Pno(WORKS_ON))

RESULT_EMP_SSNS ← EMP_PROJ ÷ DEPT5_PROJS

RESULT ← πLname, Fname(RESULT_EMP_SSNS * EMPLOYEE)


Examples of Queries in relational
algebra
List the names of all employees with two or more
dependents.

T1(Ssn, No_of_dependents) ← Essn ℑ COUNT Dependent_name(DEPENDENT)

T2 ← σNo_of_dependents≥2(T1)

RESULT ← πLname, Fname(T2 * EMPLOYEE)


Examples of Queries in relational
algebra
Retrieve the names of employees who have no dependents.

ALL_EMPS ← πSsn(EMPLOYEE)
EMPS_WITH_DEPS(Ssn) ← πEssn(DEPENDENT)
EMPS_WITHOUT_DEPS ← (ALL_EMPS – EMPS_WITH_DEPS)
RESULT ← πLname, Fname(EMPS_WITHOUT_DEPS * EMPLOYEE)
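The SET DIFFERENCE steps of this query map directly onto Python set operations. The sample tuples below are made up for illustration.

```python
# Made-up sample tuples, for illustration only.
employee = [
    {"Ssn": "123", "Lname": "Smith"},
    {"Ssn": "333", "Lname": "Wong"},
    {"Ssn": "888", "Lname": "Borg"},
]
dependent = [
    {"Essn": "123", "Dependent_name": "Alice"},
    {"Essn": "123", "Dependent_name": "Michael"},
    {"Essn": "333", "Dependent_name": "Joy"},
]

all_emps = {t["Ssn"] for t in employee}           # πSsn(EMPLOYEE)
emps_with_deps = {t["Essn"] for t in dependent}   # πEssn(DEPENDENT)
emps_without_deps = all_emps - emps_with_deps     # set difference

# Join back to EMPLOYEE and project the name.
result = [t["Lname"] for t in employee if t["Ssn"] in emps_without_deps]
```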
Module 2 – Chapter 3
Relational Database Design using ER-to-
Relational mapping

Step 1: Mapping of Regular Entity Types


• For each regular (strong) entity type E in the ER schema,
create a relation R that includes all the simple attributes of E.
• Include only the simple component attributes of a composite
attribute.
• Choose one of the key attributes of E as the primary key for
R. If the chosen key of E is a composite, then the set of
simple attributes that form it will together form the primary
key of R.
Relational Database
Design using ER-to-
Relational mapping

• If multiple keys were identified for E
during the conceptual design, the
information describing the attributes that
form each additional key is kept to
specify additional (unique) keys of
relation R.
• Knowledge about keys is also kept for
indexing purposes and other types of
analyses.
Relational Database Design using ER-to-
Relational mapping

Step 2: Mapping of Weak Entity Types


• For each weak entity type W in the ER schema with owner
entity type E, create a relation R, and include all simple
attributes (or simple components of composite attributes) of
W as attributes.
• In addition, include as foreign key attributes of R the primary
key attribute(s) of the relation(s) that correspond to the
owner entity type(s).
• The primary key of R is the combination of the primary
key(s) of the owner(s) and the partial key of the weak entity
type W, if any.
Relational Database Design using ER-to-
Relational mapping

Step 3: Mapping of Binary 1:1 Relationship Types


• For each binary 1:1 relationship type R in the ER
schema, identify the relations S and T that correspond
to the entity types participating in R.
• There are three possible approaches:
(1) the foreign key approach
(2) the merged relationship approach
(3) the cross reference or relationship relation
approach
Relational Database Design using ER-to-
Relational mapping

Foreign key approach:


• Choose one of the relations, say S, and include as a foreign key
in S the primary key of T. It is better to choose an entity type
with total participation in R in the role of S.
• We map the 1:1 relationship type MANAGES by choosing the
participating entity type DEPARTMENT to serve in the role of S
because its participation in the MANAGES relationship type is
total (every department has a manager).
• We include the primary key of the EMPLOYEE relation as
foreign key in the DEPARTMENT relation and rename it to
Mgr_ssn.
Relational Database Design using ER-to-
Relational mapping

Merged relation approach:


• An alternative mapping of a 1:1 relationship type is to
merge the two entity types and the relationship into a
single relation.
• This is possible when both participations are total, as
this would indicate that the two tables will have the
exact same number of tuples at all times.
Relational Database Design using ER-to-
Relational mapping

Cross-reference or relationship relation approach:


• The third option is to set up a third relation R for the
purpose of cross-referencing the primary keys of the
two relations S and T representing the entity types.
• This approach is required for binary M:N relationships.
The relation R is called a relationship relation (lookup
table).
Relational Database Design using ER-to-
Relational mapping

Step 4: Mapping of Binary 1:N Relationship Types


The foreign key approach:
• For each regular binary 1:N relationship type R, identify the
relation S that represents the participating entity type at
the N-side of the relationship type.
• Include as foreign key in S the primary key of the relation T
that represents the other entity type participating in R.
• For WORKS_FOR we include the primary key Dnumber of
the DEPARTMENT relation as foreign key in the EMPLOYEE
relation and call it Dno.
Relational Database Design using ER-to-
Relational mapping

The relationship relation approach:


• An alternative approach is to use the relationship relation
(cross-reference) option as in the third option for binary
1:1 relationships.
• We create a separate relation R whose attributes are the
primary keys of S and T, which will also be foreign keys to
S and T.
• This is only recommended when few relationship instances
exist, in order to avoid NULL values in foreign keys.
Relational Database Design using ER-to-
Relational mapping

Step 5: Mapping of Binary M:N Relationship Types


• In the traditional relational model with no multivalued
attributes, the only option for M:N relationships is the
relationship relation (cross-reference) option.
• For each binary M:N relationship type R, create a new
relation S to represent R.
• Include as foreign key attributes in S the primary keys of
the relations that represent the participating entity types;
their combination will form the primary key of S.
Relational Database Design using ER-to-
Relational mapping

• Map the M:N relationship type WORKS_ON by creating
the relation WORKS_ON.
• We include the primary keys of the PROJECT and
EMPLOYEE relations as foreign keys in WORKS_ON and
rename them Pno and Essn, respectively.
• The primary key of the WORKS_ON relation is the
combination of the foreign key attributes {Essn, Pno}.
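The WORKS_ON mapping can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the tables are reduced to their key attributes, the column types are guesses, and the helper try_insert and the sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only if enabled
conn.executescript("""
CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY);
CREATE TABLE PROJECT  (Pnumber INTEGER PRIMARY KEY);
-- Relationship relation for the M:N relationship type WORKS_ON.
CREATE TABLE WORKS_ON (
    Essn TEXT    NOT NULL REFERENCES EMPLOYEE(Ssn),
    Pno  INTEGER NOT NULL REFERENCES PROJECT(Pnumber),
    PRIMARY KEY (Essn, Pno)
);
""")
conn.execute("INSERT INTO EMPLOYEE VALUES ('123')")
conn.execute("INSERT INTO PROJECT VALUES (1)")

def try_insert(essn, pno):
    """Return True if the WORKS_ON tuple is accepted, False if a
    key or referential integrity constraint rejects it."""
    try:
        conn.execute("INSERT INTO WORKS_ON VALUES (?, ?)", (essn, pno))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert("123", 1)    # valid employee and project
duplicate = try_insert("123", 1)   # same {Essn, Pno} again
dangling = try_insert("999", 1)    # no such employee
```

The second insert violates the composite primary key {Essn, Pno}, and the third violates the foreign key to EMPLOYEE.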
Relational Database Design using ER-to-
Relational mapping

Step 6: Mapping of Multivalued Attributes


• For each multivalued attribute A, create a new relation
R.
• This relation R will include an attribute corresponding to
A, plus the primary key attribute K—as a foreign key in
R—of the relation that represents the entity type or
relationship type that has A as a multivalued attribute.
• The primary key of R is the combination of A and K.
Relational Database Design using ER-to-
Relational mapping

• We create a relation DEPT_LOCATIONS.


• The attribute Dlocation represents the multivalued
attribute LOCATIONS of DEPARTMENT, whereas
Dnumber—as foreign key—represents the primary key
of the DEPARTMENT relation.
• The primary key of DEPT_LOCATIONS is the
combination of {Dnumber, Dlocation}.
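Step 6 can be sketched in Python by flattening a multivalued attribute into one tuple per value. The nested input format and the sample values are made up for illustration.

```python
# Conceptual DEPARTMENT entities with a multivalued Locations attribute
# (made-up values, for illustration only).
department_entities = [
    {"Dnumber": 5, "Dname": "Research",
     "Locations": ["Bellaire", "Sugarland", "Houston"]},
    {"Dnumber": 1, "Dname": "Headquarters", "Locations": ["Houston"]},
]

# DEPARTMENT keeps only the single-valued attributes.
department = [{"Dnumber": d["Dnumber"], "Dname": d["Dname"]}
              for d in department_entities]

# DEPT_LOCATIONS gets one tuple per (Dnumber, Dlocation) combination.
dept_locations = [{"Dnumber": d["Dnumber"], "Dlocation": loc}
                  for d in department_entities
                  for loc in d["Locations"]]

# The combination {Dnumber, Dlocation} is unique across all tuples.
keys = {(t["Dnumber"], t["Dlocation"]) for t in dept_locations}
```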
Relational Database Design using ER-to-
Relational mapping

Step 7: Mapping of N-ary Relationship Types


• For each n-ary relationship type R, where n > 2, create
a new relationship relation S to represent R.
• Include as foreign key attributes in S the primary keys
of the relations that represent the participating entity
types.
• The primary key of S is usually a combination of all the
foreign keys that reference the relations representing
the participating entity types.
Relational Database
Design using ER-to-
Relational mapping

• Consider the ternary relationship type
SUPPLY, which relates a
SUPPLIER s, a PART p, and a
PROJECT j. It is mapped to the
relation SUPPLY.
• The primary key of SUPPLY is the
combination of the three
foreign keys {Sname,
Part_no, Proj_name}.
End of Module 2
