Relational Model and Relational Algebra

The relational model was introduced by Ted Codd in 1970 and uses mathematical relations (tables) as its basic building block. It is based on set theory and first-order predicate logic. The model represents a database as a collection of relations, where each relation is a table composed of rows (tuples) and columns (attributes). Attributes have domains which define their possible data types and values. Relation schemas define the structure of relations, and relation states hold the actual data values in tuples. Relationships between relations are represented using constraints.

Uploaded by

soorajkumar2828

Relational Model and Relational Algebra

Relational Model

•The relational model was first introduced by Ted Codd of IBM in 1970.

•The model uses the concept of a mathematical relation (a table of values) as its basic building block.

•The model has its theoretical basis in

• set theory

• first-order predicate logic.


Relational Model

•The model has been implemented in a large number of commercial DBMSs:
1. SQL/DS (IBM)
2. DB2 and Informix Dynamic Server (IBM)
3. Oracle and Rdb (Oracle)
4. SQL Server and Access (Microsoft)

• Our discussion is divided into two parts:

• Relational model concepts

• Operations on the relational model using relational algebra


Topics to be discussed
1. Relational Model Concepts

2. Relational Model Constraints and Relational Database Schemas

3. Update Operations and Dealing with Constraint Violations

4. Unary Relational Operations (SELECT and PROJECT)

5. Relational Operations from Set Theory

6. Binary Relational Operations

7. Examples of Queries Using Relational Operations

8. Relational Database Design Using ER-to-Relational Mapping


Relational Model Concepts

Relational model terminology

• The relational model represents the database as a collection of relations.

– A row is called a tuple.

– A column header is called an attribute.

– The table is called a relation.

– The data type describing the types of values that can appear in each column is called a domain.
Domains

• A domain D is a set of atomic values. A domain has a name, a data type, and a format.

Examples of domains

1. Phone_numbers: the set of 10-digit phone numbers

2. Social_security_numbers: the set of valid nine-digit social security numbers

3. Employee_ages: the set of integers between 16 and 60

4. Academic_Department_names: the set of academic department names in a university, such as Computer Science, Economics, and Physics
Domains

• Each domain is specified by its name, data type, and format.

Examples:
Phone_numbers
data type: a character string
format: dddd-dddddd, where each d is a digit

Employee_ages
data type: integer

Academic_Department_names
data type: a character string
Relation Schema and Attributes

• A relation schema R, denoted by R(A1, A2, . . ., An), is made up of a relation name R and a list of attributes A1, A2, . . ., An.

• Each attribute Ai is the name of a role played by some domain D in the relation schema R. D is called the domain of Ai and is denoted by dom(Ai).

• The degree (arity) of a relation is the number of attributes (n) of its relation schema.
An example : University students

STUDENT(Name, SSN, HomePhone, Address, OfficePhone, Age, GPA)

Relation Name : STUDENT

Attributes : Name, SSN, HomePhone, Address, OfficePhone, Age, GPA

Domains:
• dom(Name) = set of all names
• dom(SSN) = Social_security_numbers
• dom(HomePhone) = Local_phone_numbers
• dom(Address) = Addresses
• dom(OfficePhone) = Local_phone_numbers
• dom(GPA) = Grade_point_averages

Degree of the relation : 7


Relation state

• A relation state r of the relation schema R(A1, A2, . . ., An), denoted by r(R), is a set of n-tuples r = {t1, t2, . . ., tm}.
• Each n-tuple t is an ordered list of n values t = <v1, v2, . . ., vn>, where each value vi, 1 ≤ i ≤ n, is an element of dom(Ai) or is a special null value.

Figure: a relation state r of the STUDENT relation schema.

• The ith value in tuple t, which corresponds to the attribute Ai, is referred to as t[Ai].

• The terms relation intension for the schema R and relation extension for a relation state r(R) are also commonly used.

More formally, a relation state can be restated as follows:
A relation (relation state) r(R) is a mathematical relation of degree n on the domains dom(A1), dom(A2), . . ., dom(An), which is a subset of the Cartesian product of the domains that define R:
r(R) ⊆ (dom(A1) × dom(A2) × . . . × dom(An))
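The formal definition can be made concrete with a small sketch. The two domains below are hypothetical and deliberately tiny so that the full Cartesian product can be enumerated; null values are left out for simplicity.

```python
from itertools import product

# Hypothetical, deliberately tiny domains (assumptions for illustration).
dom_name = {"Ann", "Bob"}
dom_age = {16, 17}

# The Cartesian product dom(Name) x dom(Age) holds every possible 2-tuple.
all_tuples = set(product(dom_name, dom_age))

# A relation state r(R) is any subset of that product. Being a set, it has
# no duplicate tuples and no inherent tuple order.
r = {("Ann", 16), ("Bob", 17)}

print(len(all_tuples))                   # 4 possible tuples
print(r <= all_tuples)                   # True: r is a subset of the product
print(r == {("Bob", 17), ("Ann", 16)})   # True: tuple order does not matter
```

Real domains are usually far too large to enumerate; the point here is only that a state is one subset among the possible combinations of domain values.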
Characteristics of Relations
1. Ordering of Tuples in a Relation
• A relation is a set of tuples. Hence, tuples in a relation do not have any particular order.

• Tuple ordering is not part of a relation definition, but many logical orderings can be specified on a relation.

• When a relation is implemented as a file or displayed as a table, a particular ordering may be specified on the records of the file or on the rows of the table.

The above relations are identical (i.e., tuple ordering does not matter).
Characteristics of Relations

2. Ordering of Values within a Tuple

• An n-tuple is an ordered list of n values. Hence the ordering of values in an n-tuple is important.

• However, at a logical level, the order of attributes and their values is not important as long as the correspondence between the attributes and values is maintained.

The following two tuples are identical

Characteristics of Relations

3. Values and Nulls in the Tuples

• In a relation, each value in a tuple is an atomic (indivisible) value.

• Hence composite and multivalued attributes are not allowed. That is why this model is called the flat relational model.

• Multivalued attributes must be represented in a separate relation.

• Composite attributes are represented only by their simple components (e.g., fname).

• The values of some attributes within a particular tuple may be unknown or may not apply to that tuple. A special value, called null, can be used in this case.

• Some students do not have an office, so their OfficePhone value is null (not applicable).
• Some students do not have a home phone, or have one that we do not know, so their HomePhone value is null (unknown).
Characteristics of Relations
4. Interpretation (Meaning) of a Relation

a) The relation schema can be interpreted as a declaration or a type of assertion. Each tuple in the relation can then be interpreted as a fact or a particular instance of the assertion.

– Some relations may represent facts about entities, whereas other relations may represent facts about relationships.

For example: a relation schema MAJORS(StudentSSN, DepartmentCode) asserts that students major in academic departments; a tuple in this relation relates a student to his or her major department.

b) An alternative interpretation of a relation schema is as a predicate. In this case, the values in each tuple are interpreted as values that satisfy the predicate.
Relational Model Notation :

• The letters Q, R, S denote relation names.

• The letters q, r, s denote relation states.

• The letters t, u, v denote tuples.

• A relation schema R of degree n is denoted by R(A1, A2, . . ., An), where A1, A2, . . ., An are the attributes of R.

• An n-tuple t in a relation r(R) is denoted by t = <v1, v2, . . ., vn>, where vi is the value corresponding to attribute Ai. The following notation refers to component values of tuples:

– Both t[Ai] and t.Ai refer to the value vi in t for attribute Ai.

– Both t[Au, Aw, . . ., Az] and t.(Au, Aw, . . ., Az), where Au, Aw, . . ., Az is a list of attributes from R, refer to the subtuple of values <vu, vw, . . ., vz> from t corresponding to the attributes specified in the list.

• An attribute A can be qualified with the relation name R to which it belongs by using the dot notation R.A, for example STUDENT.Name or STUDENT.Age. This is needed because the same name may be used for two attributes in different relations.
Relational Model Notation : Example

Relation schema:
STUDENT(Name, SSN, HomePhone, Address, OfficePhone, Age, GPA)
Relation state: r(STUDENT)

t4 = <Charles Cooper, 489-22-1100, 376-9821, 265 Lark Lane, 749-6492, 28, 3.92>

t4[SSN] or t4.SSN = 489-22-1100
t3[Age] or t3.Age = 25
t1[Name] or t1.Name = Benjamin Bayer
Relational Model Constraints and Relational Database Schemas
Relational Model Constraints
• Constraints specify restrictions on the actual values in a database.
• Constraints on databases can generally be divided into three main categories:

1. Inherent model-based constraints

2. Schema-based constraints (we discuss only these)

3. Application-based constraints

Data dependency constraints:

1. Functional dependencies
2. Multivalued dependencies
Relational Model Constraints
1. Inherent model-based constraints: constraints that are inherent in the data model.

2. Schema-based constraints: constraints that can be directly expressed in the schemas of the data model.
Includes:
1. Domain constraints
2. Key constraints and constraints on null values
3. Entity integrity and referential integrity constraints

3. Application-based constraints: constraints that cannot be directly expressed in the schemas of the data model.
1. Domain constraints :

• Domain constraints specify that within each tuple, the value of each attribute A must be an atomic value from the domain dom(A).

• The data types associated with domains include:

1. standard numeric data types for integers (short integer, integer, long integer)
2. real numbers (float and double-precision float)
3. characters, fixed-length strings, and variable-length strings
4. date, time, timestamp, and sometimes money data types
5. enumerated data types
2. Key Constraints and Constraints on Null

Superkey: A superkey SK specifies a uniqueness constraint that no two distinct tuples in any state r of R can have the same combination of values for all the attributes in SK.

Formally,
Consider a relation schema R(A, B, C, D).
Let r(R) be a relation state of R.
Let SK denote a subset of the attributes of R, say SK = {A, C}.

Then SK is a superkey of R if, for any two distinct tuples t1 and t2 in any state r of R,

t1[SK] ≠ t2[SK] (here, t1[A, C] ≠ t2[A, C]).

Every relation has at least one default superkey: the set of all its attributes.
2. Key Constraints and Constraints on Null

Superkey: Example

The above is a relation state of the STUDENT relation, i.e., r(STUDENT).

The subset SK = {SSN, Name, Age} is a superkey of STUDENT, because for any two distinct tuples t1, t2 in r(STUDENT),
t1[SK] ≠ t2[SK].
2. Key Constraints and Constraints on Null

Key: A key K of a relation schema R is a superkey of R with the additional property that removing any attribute A from K leaves a set of attributes K’ that is not a superkey of R.

{SSN} is a key of STUDENT (and also a superkey of STUDENT),

but {SSN, Name, Age} is not a key. Why? Because removing Name or Age still leaves a superkey, so the set is not minimal.
2. Key Constraints and Constraints on Null

Hence, a key satisfies two constraints:

1. Two distinct tuples in any state of the relation cannot have identical values for (all) the attributes in the key.

2. It is a minimal superkey, that is, a superkey from which we cannot remove any attributes and still have the uniqueness constraint in condition 1 hold.

A key is determined from the meaning of the attributes, and the property is time-invariant.
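The two conditions can be checked mechanically against a single relation state, as in the sketch below. The helper names `is_superkey` and `is_key` and the sample rows are hypothetical. Note the limitation: a key is a property of the schema that must hold in every state, so passing this check on one state does not prove that a set of attributes is a key; failing it does prove that it is not.

```python
def is_superkey(state, attrs):
    """True if no two tuples in this state agree on all of attrs."""
    seen = set()
    for t in state:
        value = tuple(t[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def is_key(state, attrs):
    """True if attrs is a superkey and removing any one attribute breaks it."""
    if not is_superkey(state, attrs):
        return False
    return all(
        not is_superkey(state, [a for a in attrs if a != removed])
        for removed in attrs
    )


# Hypothetical STUDENT state (illustrative values only).
students = [
    {"SSN": "111", "Name": "Ann", "Age": 19},
    {"SSN": "222", "Name": "Bob", "Age": 19},
    {"SSN": "333", "Name": "Ann", "Age": 21},
]

print(is_superkey(students, ["SSN", "Name", "Age"]))  # True
print(is_key(students, ["SSN", "Name", "Age"]))       # False: not minimal
print(is_key(students, ["SSN"]))                      # True
print(is_superkey(students, ["Age"]))                 # False: 19 appears twice
```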
2. Key Constraints and Constraints on Null

Superkey examples

Superkeys of CAR:
1. {LN, ESN, Make, Model, Year} (not a key)
2. {LN, Year} (not a key)
3. {ESN, Model, Year} (not a key)
4. {ESN} (also a key)

Is {Year} a superkey of CAR?


2. Key Constraints and Constraints on Null

Key examples
Keys of the CAR relation:

Key1 = {LN}
Key2 = {ESN}
Both are also superkeys of CAR.

BUT {LN, Make} is a superkey that is not a key.

• A relation schema may have more than one key. In this case, each of the keys is called a candidate key (LN and ESN are candidate keys).

• If a relation has several candidate keys, one is chosen arbitrarily to be the primary key. The primary key attributes are underlined.
2. Key Constraints and Constraints on Null

Constraints on Null :
Specifies whether null values are or are not permitted.

For example, if every STUDENT tuple must have a valid, non-null value
for the Name attribute, then Name of STUDENT is constrained to be
NOT NULL.
Relational Databases and Relational Database Schemas

Relational database schema: A relational database schema S is a set of relation schemas S = {R1, R2, . . ., Rm} and a set of integrity constraints IC.

Figure: schema diagram for the COMPANY relational database schema.
Relational Databases and Relational Database Schemas

Relational database state: A relational database state DB of S is a set of relation states DB = {r1, r2, . . ., rm} such that each ri is a state of Ri and the ri relation states satisfy the integrity constraints specified in IC.

Invalid database state: A database state that does not obey all the integrity constraints is called an invalid state.

Valid database state: A state that satisfies all the constraints in IC is called a valid state.
Relational database state : example

Figure: one possible database state for the COMPANY database schema.
Relational Databases and Relational Database Schemas

Note the following :

1. Attributes that represent the same real-world concept may or may not have identical names in different relations.
E.g., DNO in EMPLOYEE and DNUM in PROJECT.

2. Attributes that represent different concepts may have the same name in different relations.
E.g., using NAME for both PNAME of PROJECT and DNAME of DEPARTMENT.

3. If the same real-world concept is used in different roles within the same relation, the corresponding attributes must have different names.


3. Entity Integrity, Referential Integrity

Entity integrity: The entity integrity constraint states that no primary key value can be null.
• Key and entity integrity constraints are specified on individual relations.

Referential integrity: The referential integrity constraint states that a tuple in one relation that refers to another relation must refer to an existing tuple in that relation.
• Referential integrity is specified between two relations.
• It is used to maintain consistency among the tuples in the two relations.

Referential integrity can be defined more formally using foreign keys.

Foreign keys: A set of attributes FK in relation schema R1 is a foreign key of R1 that references relation R2 if it satisfies the following two rules:

1. The attributes in FK have the same domain(s) as the primary key attributes PK of R2; the attributes FK are said to reference or refer to the relation R2.

2. For every tuple t1 ∈ r(R1), either t1[FK] = t2[PK] for some tuple t2 ∈ r(R2), or t1[FK] is null.

R1 is called the referencing relation and R2 is the referenced relation.
A foreign key can refer to its own relation.
E.g.: SUPERSSN of EMPLOYEE refers to SSN of EMPLOYEE.
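Rule 2 can be checked directly against a database state. The sketch below assumes single-attribute keys and represents null as Python's `None`; the function name `fk_violations` and the sample rows are hypothetical.

```python
def fk_violations(referencing, fk, referenced, pk):
    """Return the tuples of `referencing` whose FK value is neither null
    nor equal to the PK value of some tuple in `referenced`."""
    pk_values = {t[pk] for t in referenced}
    return [
        t for t in referencing
        if t[fk] is not None and t[fk] not in pk_values
    ]


# Hypothetical state (illustrative values only).
department = [{"DNUMBER": 4}, {"DNUMBER": 5}]
employee = [
    {"SSN": "111", "SUPERSSN": None, "DNO": 5},
    {"SSN": "222", "SUPERSSN": "111", "DNO": 4},
    {"SSN": "333", "SUPERSSN": "999", "DNO": 7},
]

# EMPLOYEE.DNO references DEPARTMENT.DNUMBER.
bad_dno = fk_violations(employee, "DNO", department, "DNUMBER")
# EMPLOYEE.SUPERSSN references EMPLOYEE.SSN (a foreign key to its own relation).
bad_super = fk_violations(employee, "SUPERSSN", employee, "SSN")

print([t["SSN"] for t in bad_dno])    # ['333']: department 7 does not exist
print([t["SSN"] for t in bad_super])  # ['333']: no employee has SSN 999
```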
Displaying Referential integrity constraints

Referential integrity constraints displayed on the COMPANY relational database schema.


Other types of constraint

1. Semantic integrity constraints:

• Include a large class of general constraints.

E.g.: The salary of an employee should not exceed the salary of the employee’s supervisor.

• Such constraints can be specified and enforced by

• the application programs, or

• a constraint specification language (triggers and assertions).

2. Functional dependency:
• Establishes a functional relationship between two sets of attributes X and Y.
• It specifies that the value of X determines the value of Y in all states of a relation. It is denoted by X → Y.

E.g.: In a STUDENT relation, Regno determines the values of all the other attributes in the relation.

3. Transition constraints: These deal with changes in the database and are enforced by application programs.
E.g.: The salary of an employee can only increase.
3.4 Update Operations and Dealing with Constraint
Violations

The operations of the relational model can be categorized into retrievals and updates.
•Update operations:
1. The Insert operation
2. The Delete operation
3. The Update operation

• Insert is used to insert a new tuple or tuples in a relation.
• Delete is used to delete tuples.
• Update (Modify) is used to change the values of some attributes in existing tuples.
Example Database state
3.4 Update Operations and Dealing with Constraint
Violations

The Insert Operation

•The Insert operation provides a list of attribute values for a new tuple t that is to be inserted into a relation R.

•Insert can violate any of the four types of constraints:

1. Domain constraints
2. Key constraints
3. Entity integrity
4. Referential integrity
3.4 Update Operations and Dealing with Constraint Violations

The Insert Operation : Examples of violations

1. Insert <‘Cecilia’, ‘F’, ‘Kolonsky’, null, ‘1960-04-05’, ‘6357 Windy Lane, Katy, TX’, F, 28000,
null, 4> into EMPLOYEE
2. Insert <‘Alicia’, ‘J’, ‘Zelaya’, ‘999887777’, ‘1960-04-05’, ‘6357 Windy Lane, Katy, TX’, F,
28000, ‘987654321’, 4> into EMPLOYEE.
3. Insert <‘Cecilia’, ‘F’, ‘Kolonsky’, ‘677678989’, ‘1960-04-05’, ‘6357 Windswept, Katy, TX’, F,
28000, ‘987654321’, 7> into EMPLOYEE.
4. Insert <‘Cecilia’, ‘F’, ‘Kolonsky’, ‘677678989’, ‘1960-04-05’, ‘6357 Windy Lane, Katy, TX’, F,
28000, null, 4> into EMPLOYEE.
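The four insertions can be classified with a small checker. This is only a sketch: the database state figure is not reproduced here, so the state below is an assumption (employees with SSN 999887777 and 987654321 exist, and departments 1, 4, and 5 exist but 7 does not). Only the SSN, SUPERSSN, and DNO values of each tuple are kept, domain checks are omitted, and `check_insert` is a hypothetical helper.

```python
def check_insert(new, employee, department):
    """Return the name of the first violated constraint, or None if the
    insertion is acceptable. Domain constraints are not checked here."""
    ssns = {t["SSN"] for t in employee}
    dnumbers = {d["DNUMBER"] for d in department}
    if new["SSN"] is None:
        return "entity integrity"       # primary key value is null
    if new["SSN"] in ssns:
        return "key"                    # duplicate primary key value
    if new["SUPERSSN"] is not None and new["SUPERSSN"] not in ssns:
        return "referential integrity"  # supervisor does not exist
    if new["DNO"] is not None and new["DNO"] not in dnumbers:
        return "referential integrity"  # department does not exist
    return None


# Assumed state (see the lead-in): only the key attributes are modelled.
employee = [{"SSN": "999887777"}, {"SSN": "987654321"}]
department = [{"DNUMBER": 1}, {"DNUMBER": 4}, {"DNUMBER": 5}]

# SSN, SUPERSSN and DNO of the four example insertions, in order.
inserts = [
    {"SSN": None, "SUPERSSN": None, "DNO": 4},
    {"SSN": "999887777", "SUPERSSN": "987654321", "DNO": 4},
    {"SSN": "677678989", "SUPERSSN": "987654321", "DNO": 7},
    {"SSN": "677678989", "SUPERSSN": None, "DNO": 4},
]

results = [check_insert(t, employee, department) for t in inserts]
print(results)
# ['entity integrity', 'key', 'referential integrity', None]
```

Under the assumed state, only the fourth insertion is acceptable.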
3.4 Update Operations and Dealing with Constraint Violations

The Insert Operation :

What will the DBMS do when an Insert operation violates constraints?

There are several options for the DBMS when an Insert operation causes a violation:
1. Reject the insertion.
2. Attempt to correct the reason for rejecting the insertion.
3.4 Update Operations and Dealing with Constraint Violations

The Delete Operation


The Delete operation can violate only referential integrity, if the tuple being
deleted is referenced by the foreign keys from other tuples in the database.
3.4 Update Operations and Dealing with Constraint Violations

The Delete Operation :Examples of violations

Are the following delete operations acceptable ?

1. Delete the WORKS_ON tuple with ESSN = ‘999887777’ and PNO = 10.

2. Delete the EMPLOYEE tuple with SSN = ‘999887777’.

3. Delete the EMPLOYEE tuple with SSN = ‘333445555’.


3.4 Update Operations and Dealing with Constraint Violations

The Delete Operation

What will the DBMS do when a Delete operation violates referential integrity constraints?

There are several options for the DBMS:

1. Reject the deletion.
2. Cascade the deletion.
3. Modify the referencing attribute values that cause the violation.
4. Use a combination of these three options.
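The first three options can be sketched for one referenced relation and one referencing relation. The example assumes EMPLOYEE.DNO references DEPARTMENT.DNUMBER; the function name `delete_department`, the option names, and the sample rows are hypothetical. For option 3 the referencing value is set to null, which is only legal when the foreign key is not part of the primary key.

```python
def delete_department(dnumber, department, employee, option="reject"):
    """Delete the DEPARTMENT tuple with the given DNUMBER and return the new
    (department, employee) states. The input lists are not modified."""
    referenced = any(e["DNO"] == dnumber for e in employee)
    if referenced:
        if option == "reject":
            raise ValueError(f"department {dnumber} is still referenced")
        elif option == "cascade":
            # Also delete the tuples that reference the deleted tuple.
            employee = [e for e in employee if e["DNO"] != dnumber]
        elif option == "set_null":
            # Modify the referencing attribute values instead.
            employee = [
                {**e, "DNO": None} if e["DNO"] == dnumber else e
                for e in employee
            ]
        else:
            raise ValueError(f"unknown option: {option}")
    department = [d for d in department if d["DNUMBER"] != dnumber]
    return department, employee


# Hypothetical state (illustrative values only).
department = [{"DNUMBER": 4}, {"DNUMBER": 5}]
employee = [{"SSN": "111", "DNO": 5}, {"SSN": "222", "DNO": 4}]

dept_c, emp_c = delete_department(5, department, employee, "cascade")
dept_n, emp_n = delete_department(5, department, employee, "set_null")
print(emp_c)  # [{'SSN': '222', 'DNO': 4}]
print(emp_n)  # [{'SSN': '111', 'DNO': None}, {'SSN': '222', 'DNO': 4}]
```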
3.4 Update Operations and Dealing with Constraint Violations

The Update Operation


•The Update operation is used to change the values of one or more attributes in a tuple (or tuples) of some relation R. It is necessary to specify a condition on the attributes of the relation to select the tuple (or tuples) to be modified.

•Updating an attribute that is neither a primary key nor a foreign key usually causes no problems.
•Modifying a primary key value is similar to deleting one tuple and inserting another in its place.
•If a foreign key attribute is modified, the DBMS must make sure that the new value refers to an existing tuple in the referenced relation.
3.4 Update Operations and Dealing with Constraint Violations

The Update Operation : Examples of updates and violations

Are the following Update operations acceptable?

1. Update the SALARY of the EMPLOYEE tuple with SSN = ‘999887777’ to 28000.
2. Update the DNO of the EMPLOYEE tuple with SSN = ‘999887777’ to 1.
3. Update the DNO of the EMPLOYEE tuple with SSN = ‘999887777’ to 7.
4. Update the SSN of the EMPLOYEE tuple with SSN = ‘999887777’ to ‘987654321’.
The Relational Algebra

• The basic set of operations for the relational model is the relational algebra.
• The relational algebra operations are usually divided into two groups.
• One group includes set operations from mathematical set theory:
1. UNION
2. INTERSECTION
3. SET DIFFERENCE
4. CARTESIAN PRODUCT

• The other group consists of operations developed specifically for relational databases:
1. SELECT (unary operation)
2. PROJECT (unary operation)
3. JOIN (binary operation)
SELECT

• The SELECT operation is used to select a subset of the tuples from a relation that satisfy a selection condition.

• The SELECT operation can also be visualized as a horizontal partition of the relation into two sets of tuples:

1. those tuples that satisfy the condition and are selected
2. those tuples that do not satisfy the condition and are discarded.

• Denoted by σ (sigma).

• General format:
σ <selection condition>(R)
where
• the symbol σ (sigma) denotes the SELECT operator,
• the selection condition is a Boolean expression specified on the attributes of relation R,
• R is generally a relational algebra expression whose result is a relation.

• The relation resulting from the SELECT operation has the same attributes as R.
SELECT

Examples:
1. Select the EMPLOYEE tuples whose department is 4:
σ DNO=4 (EMPLOYEE)

2. Select the EMPLOYEE tuples whose salary is greater than 30,000:
σ SALARY>30000 (EMPLOYEE)
SELECT

• The Boolean expression specified in <selection condition> is made up of a number of clauses of the form

1. <attribute name> <comparison operator> <constant value>, or
2. <attribute name> <comparison operator> <attribute name>

• <attribute name> is the name of an attribute of R.

• <comparison operator> is one of the operators {=, <, >, ≤, ≥, ≠}.


SELECT

• Clauses can be connected arbitrarily using the Boolean operators AND, OR, and NOT.

For example: Select all employees who either work in department 4 and make over 25,000 per year, or work in department 5 and make over 30,000.

σ (DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000) (EMPLOYEE)


Figure: the EMPLOYEE relation and the result of the SELECT operation.
SELECT

• The Boolean conditions AND, OR, and NOT have their normal interpretation as follows:

• (cond1 AND cond2) is true if both (cond1) and (cond2) are true; otherwise, it is false.

• (cond1 OR cond2) is true if either (cond1) or (cond2) or both are true; otherwise, it is false.

• (NOT cond) is true if cond is false; otherwise, it is false.
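SELECT with a compound Boolean condition can be sketched as a filter over a list of tuples, each tuple being a dictionary from attribute name to value. The `select` helper and the EMPLOYEE rows below are hypothetical and chosen only to exercise the department 4 / department 5 condition.

```python
def select(relation, condition):
    """sigma_<condition>(relation): keep the tuples that satisfy condition."""
    return [t for t in relation if condition(t)]


# Hypothetical EMPLOYEE state (illustrative values only).
employee = [
    {"LNAME": "Smith", "DNO": 5, "SALARY": 30000},
    {"LNAME": "Wong", "DNO": 5, "SALARY": 40000},
    {"LNAME": "Zelaya", "DNO": 4, "SALARY": 25000},
    {"LNAME": "Wallace", "DNO": 4, "SALARY": 43000},
]

# sigma (DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000) (EMPLOYEE)
result = select(
    employee,
    lambda t: (t["DNO"] == 4 and t["SALARY"] > 25000)
    or (t["DNO"] == 5 and t["SALARY"] > 30000),
)
print([t["LNAME"] for t in result])  # ['Wong', 'Wallace']


def in_dept_5(t):
    return t["DNO"] == 5


def high_paid(t):
    return t["SALARY"] > 30000


# Applying two selections in either order gives the same tuples.
a = select(select(employee, in_dept_5), high_paid)
b = select(select(employee, high_paid), in_dept_5)
print(a == b)  # True
```

The result always has the same attributes as the input and never more tuples than the input.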


SELECT

Note the following

• The number of tuples in the resulting relation is always less than or equal to the number of tuples in R.
• The degree of the relation resulting from a SELECT operation is the same as that of R.

• The SELECT operation is commutative; that is:

σ <cond1>(σ <cond2>(R)) = σ <cond2>(σ <cond1>(R))

• A cascade of SELECT operations can be combined into a single SELECT operation using an AND condition:
σ <cond1>(σ <cond2>(. . .(σ <condn>(R)). . .)) = σ <cond1> AND <cond2> AND . . . AND <condn> (R)
PROJECT

• The PROJECT operation selects certain columns from the table and
discards the other columns.
• Can be visualized as a vertical partition of the relation into two
relations:
1. one has the needed columns and contains the result of the
operation,
2. and the other contains the discarded columns.

• Denoted by π

• General form :
π <attribute list>(R)
PROJECT
E.g.: List each employee's first and last name and salary:
π LNAME, FNAME, SALARY (EMPLOYEE)
PROJECT

• The PROJECT operation removes any duplicate tuples, so the result is a valid relation (a set of tuples).
E.g.: List the SEX and SALARY of employees:

π SEX, SALARY (EMPLOYEE)
PROJECT
Note the Following :

• The number of tuples in a relation resulting from a PROJECT operation is always less than or equal to the number of tuples in R.
• If the projection list is a superkey of R, then the resulting relation has the same number of tuples as R.

• π <list1>(π <list2>(R)) = π <list1>(R) as long as <list2> contains the attributes in <list1>.

E.g.: π NAME, SEX (π NAME, SEX, SALARY (R)) = π NAME, SEX (R)
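The properties of PROJECT listed above can be observed on a small state. In this sketch the `project` helper and the EMPLOYEE rows are hypothetical; duplicate elimination is done explicitly because Python lists, unlike relations, allow duplicates.

```python
def project(relation, attrs):
    """pi_<attrs>(relation): keep only attrs and drop duplicate tuples."""
    result = []
    for t in relation:
        row = {a: t[a] for a in attrs}
        if row not in result:
            result.append(row)
    return result


# Hypothetical EMPLOYEE state (illustrative values only).
employee = [
    {"SSN": "1", "SEX": "M", "SALARY": 30000},
    {"SSN": "2", "SEX": "F", "SALARY": 25000},
    {"SSN": "3", "SEX": "F", "SALARY": 25000},
    {"SSN": "4", "SEX": "M", "SALARY": 40000},
]

sex_salary = project(employee, ["SEX", "SALARY"])
print(len(sex_salary))                 # 3: the duplicate <F, 25000> is removed
print(len(project(employee, ["SSN"])))  # 4: SSN is a superkey, nothing is lost

# pi_<list1>(pi_<list2>(R)) = pi_<list1>(R) when <list2> contains <list1>.
nested = project(project(employee, ["SSN", "SEX", "SALARY"]), ["SEX", "SALARY"])
print(nested == sex_salary)            # True
```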


Sequence of Relational Operations

• We can write a relational algebra expression in two ways:

1. as a single relational algebra expression, by nesting the operations, or
2. as a sequence of simple operations that produce intermediate results.

E.g.: List LNAME, FNAME, SALARY of all employees whose department number is 5.

As a single expression:
π LNAME, FNAME, SALARY (σ DNO=5 (EMPLOYEE))

OR, as a sequence of operations with an intermediate result:
TEMP ← σ DNO=5 (EMPLOYEE)
R ← π LNAME, FNAME, SALARY (TEMP)

Both forms produce the same result.

The RENAME operation
Used to rename the attributes of a relation.
Example:

TEMP ← σ DNO=5 (EMPLOYEE)

R(FIRSTNAME, LASTNAME, SALARY) ← π FNAME, LNAME, SALARY (TEMP)

FNAME, LNAME, SALARY are renamed as FIRSTNAME, LASTNAME, SALARY.


The RENAME operation

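• We can also define a formal RENAME operation, which can rename either the relation name or the attribute names, or both.

• The general RENAME operation, when applied to a relation R of degree n, is denoted by

ρ S(B1, B2, . . ., Bn)(R) or ρ S(R) or ρ (B1, B2, . . ., Bn)(R)

where ρ (rho) denotes the RENAME operator, S is the new relation name, and B1, B2, . . ., Bn are the new attribute names.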


Set Theoretic Operations
• UNION: The result of this operation, denoted by R ∪ S, is a relation that includes all tuples that are either in R or in S or in both R and S. Duplicate tuples are eliminated.

• INTERSECTION: The result of this operation, denoted by R ∩ S, is a relation that includes all tuples that are in both R and S.

• SET DIFFERENCE: The result of this operation, denoted by R − S, is a relation that includes all tuples that are in R but not in S.

• Union (type) compatibility:

Two relations R(A1, A2, . . ., An) and S(B1, B2, . . ., Bn) are said to be union compatible if they have the same degree n, and if dom(Ai) = dom(Bi) for 1 ≤ i ≤ n.
Set Theoretic Operations
Note the following

• Both UNION and INTERSECTION are commutative operations; that is,

R ∪ S = S ∪ R, and R ∩ S = S ∩ R

• Both UNION and INTERSECTION can be treated as n-ary operations applicable to any number of relations because both are associative operations; that is,

R ∪ (S ∪ T) = (R ∪ S) ∪ T, and (R ∩ S) ∩ T = R ∩ (S ∩ T)

• The MINUS (SET DIFFERENCE) operation is not commutative; that is, in general,

R − S ≠ S − R
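Because a relation state is a set of tuples, Python's built-in set type gives a direct sketch of these operations. The STUDENT and INSTRUCTOR rows below are hypothetical, and `union_compatible` checks only the degree; comparing the attribute domains is omitted.

```python
def union_compatible(r, s):
    """Simplified check: every tuple in r and s has the same degree."""
    return len({len(t) for t in r | s}) <= 1


# Hypothetical states over (first name, last name).
student = {("Susan", "Yao"), ("Ramesh", "Shah"), ("Johnny", "Kohler")}
instructor = {("John", "Smith"), ("Susan", "Yao")}

print(union_compatible(student, instructor))   # True
print(sorted(student | instructor))            # UNION: 4 tuples, no duplicate
print(sorted(student & instructor))            # INTERSECTION: [('Susan', 'Yao')]
print(sorted(student - instructor))            # STUDENT - INSTRUCTOR
print(sorted(instructor - student))            # INSTRUCTOR - STUDENT

# Commutativity holds for union and intersection but not for difference.
print(student | instructor == instructor | student)   # True
print(student - instructor == instructor - student)   # False
```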
CARTESIAN (or cross product) Operation

• This operation is used to combine tuples from two relations in a combinatorial fashion.

• In general, the result of R(A1, A2, . . ., An) × S(B1, B2, . . ., Bm) is a relation Q with degree n + m and attributes Q(A1, A2, . . ., An, B1, B2, . . ., Bm), in that order. The resulting relation Q has one tuple for each combination of tuples, one from R and one from S.

• Hence, if R has nR tuples (denoted as |R| = nR) and S has nS tuples, then R × S will have nR * nS tuples.

• The two relations do NOT have to be "type compatible".


CARTESIAN (cross product) Operation

• The cross product is useful when followed by a selection that matches the values of attributes coming from the component relations.

Example: Retrieve a list of each female employee’s dependents.

1. FEMALE_EMPS ← σ SEX=‘F’ (EMPLOYEE)

2. EMPNAMES ← π FNAME, LNAME, SSN (FEMALE_EMPS)

3. EMP_DEPENDENTS ← EMPNAMES × DEPENDENT

4. ACTUAL_DEPENDENTS ← σ SSN=ESSN (EMP_DEPENDENTS)

5. RESULT ← π FNAME, LNAME, DEPENDENT_NAME (ACTUAL_DEPENDENTS)
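The five steps can be traced on a small state. The sketch below uses hypothetical EMPLOYEE and DEPENDENT rows and a hypothetical `cartesian_product` helper; it assumes the two relations have no attribute names in common, so merging two dictionaries cannot overwrite a value.

```python
def cartesian_product(r, s):
    """R x S: one combined tuple for every pair (t from r, u from s).
    Assumes r and s have no attribute names in common."""
    return [{**t, **u} for t in r for u in s]


# Hypothetical state (illustrative values only).
employee = [
    {"FNAME": "Jennifer", "LNAME": "Wallace", "SSN": "987654321", "SEX": "F"},
    {"FNAME": "Franklin", "LNAME": "Wong", "SSN": "333445555", "SEX": "M"},
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SSN": "999887777", "SEX": "F"},
]
dependent = [
    {"ESSN": "333445555", "DEPENDENT_NAME": "Alice"},
    {"ESSN": "987654321", "DEPENDENT_NAME": "Abner"},
]

# Steps 1 to 5 of the example.
female_emps = [t for t in employee if t["SEX"] == "F"]
empnames = [{a: t[a] for a in ("FNAME", "LNAME", "SSN")} for t in female_emps]
emp_dependents = cartesian_product(empnames, dependent)
actual_dependents = [t for t in emp_dependents if t["SSN"] == t["ESSN"]]
result = [
    {a: t[a] for a in ("FNAME", "LNAME", "DEPENDENT_NAME")}
    for t in actual_dependents
]

print(len(emp_dependents))  # 4 = 2 female employees * 2 dependents
print(result)
# [{'FNAME': 'Jennifer', 'LNAME': 'Wallace', 'DEPENDENT_NAME': 'Abner'}]
```

Most of the combinations produced in step 3 are meaningless pairings; step 4 keeps only the pairs in which the dependent actually belongs to the employee.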


JOIN Operation

• The sequence of a Cartesian product followed by a SELECT is used quite commonly to identify and select related tuples from two relations, so a special operation, called JOIN, combines the two steps.

• This operation is very important for any relational database with more than a single relation, because it allows us to process relationships among relations.

• Join is denoted by ⋈.
JOIN Operation

The General Form

• A join operation on two relations R(A1, A2, . . ., An) and S(B1, B2, . . ., Bm) is written:

R ⋈ <join condition> S

• R and S can be any relations that result from general relational algebra expressions.
• The resulting relation has (n + m) attributes.
• A1, A2, . . ., An, B1, B2, . . ., Bm are the attributes of the resulting relation, in that order.
JOIN Operation

Example: Retrieve the name of the manager of each department.

To get the manager’s name, we need to combine each DEPARTMENT tuple with the EMPLOYEE tuple whose SSN value matches the MGRSSN value in the department tuple. We do this by using the JOIN operation.

DEPT_MGR ← DEPARTMENT ⋈ MGRSSN=SSN EMPLOYEE

Note: The above operation is equivalent to the Cartesian product of DEPARTMENT and EMPLOYEE followed by a SELECT operation on the resulting relation with the condition MGRSSN=SSN:
EMP_MANG ← DEPARTMENT × EMPLOYEE
DEPT_MGR ← σ MGRSSN=SSN (EMP_MANG)
JOIN Operation
Note the following

• In a JOIN, only combinations of tuples satisfying the join condition appear in the result,
• whereas in the CARTESIAN PRODUCT all combinations of tuples are included in the result.

• THETA JOIN: a join with a general join condition of the form
<condition> AND <condition> AND . . . AND <condition>

where each condition is of the form Ai Θ Bj,

Ai is an attribute of R, Bj is an attribute of S (with the same domain as Ai), and

Θ (theta) is one of the comparison operators {=, <, >, ≤, ≥, ≠}.
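A join with an arbitrary condition can be sketched as a nested loop over the two relations. The `theta_join` helper and the rows below are hypothetical, and the sketch assumes the two relations have no attribute names in common. The last lines check that the join gives the same tuples as a Cartesian product followed by a selection.

```python
def theta_join(r, s, condition):
    """R join_<condition> S: combine each pair (t, u) that satisfies the
    condition. Assumes r and s have no attribute names in common."""
    return [{**t, **u} for t in r for u in s if condition(t, u)]


# Hypothetical state (illustrative values only).
department = [
    {"DNAME": "Research", "DNUMBER": 5, "MGRSSN": "333445555"},
    {"DNAME": "Administration", "DNUMBER": 4, "MGRSSN": "987654321"},
]
employee = [
    {"LNAME": "Wong", "SSN": "333445555"},
    {"LNAME": "Wallace", "SSN": "987654321"},
    {"LNAME": "Smith", "SSN": "123456789"},
]

# DEPT_MGR <- DEPARTMENT join_(MGRSSN=SSN) EMPLOYEE
dept_mgr = theta_join(department, employee, lambda d, e: d["MGRSSN"] == e["SSN"])
print([(t["DNAME"], t["LNAME"]) for t in dept_mgr])
# [('Research', 'Wong'), ('Administration', 'Wallace')]

# Same result as a Cartesian product followed by a SELECT.
product = theta_join(department, employee, lambda d, e: True)
selected = [t for t in product if t["MGRSSN"] == t["SSN"]]
print(len(product), selected == dept_mgr)  # 6 True
```

Any comparison operator can be used inside the condition; with only `==` comparisons, as here, the join is an equijoin.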
The EQUIJOIN and NATURAL JOIN:

• A JOIN in which the only comparison operator used is = is called an EQUIJOIN.

• In the result of an EQUIJOIN we always have one or more pairs of attributes that have identical values in every tuple. NATURAL JOIN removes the second (superfluous) attribute of each such pair.

• NATURAL JOIN requires that the two join attributes (or each pair of join attributes) have the same name in both relations.

• If the join attributes do not already have the same name in both relations, one of them must first be renamed using the rename operator (ρ).

• NATURAL JOIN of R and S is denoted by R * S.
3.5 The Relational Algebra

Binary Relational Operations : JOIN and DIVISION

The EQUIJOIN and NATURAL JOIN: Example

PROJ_DEPT ← PROJECT * ρ (DNAME, DNUM, MGRSSN, MGRSTARTDATE)(DEPARTMENT)

DNUMBER of DEPARTMENT is renamed as DNUM.

In the resulting relation, only one DNUM attribute is kept.
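The rename-then-join pattern of this example can be sketched as follows. The helpers `rename` and `natural_join` and the sample rows are hypothetical, and only a few of the PROJECT and DEPARTMENT attributes are modelled.

```python
def rename(relation, mapping):
    """Rename attributes according to mapping (old name -> new name)."""
    return [{mapping.get(a, a): v for a, v in t.items()} for t in relation]


def natural_join(r, s):
    """R * S: equijoin on every attribute name the two relations share,
    keeping a single copy of each shared attribute."""
    result = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()
            if all(t[a] == u[a] for a in common):
                result.append({**t, **u})
    return result


# Hypothetical state (illustrative values only).
project_rel = [
    {"PNAME": "ProductX", "PNUMBER": 1, "DNUM": 5},
    {"PNAME": "Computerization", "PNUMBER": 10, "DNUM": 4},
]
department = [
    {"DNAME": "Research", "DNUMBER": 5},
    {"DNAME": "Administration", "DNUMBER": 4},
]

# PROJ_DEPT <- PROJECT * rho(DNUMBER renamed to DNUM)(DEPARTMENT)
proj_dept = natural_join(project_rel, rename(department, {"DNUMBER": "DNUM"}))
for t in proj_dept:
    print(t)
# {'PNAME': 'ProductX', 'PNUMBER': 1, 'DNUM': 5, 'DNAME': 'Research'}
# {'PNAME': 'Computerization', 'PNUMBER': 10, 'DNUM': 4, 'DNAME': 'Administration'}
```

If the two relations share no attribute name, the condition is vacuously true and this sketch degenerates into a Cartesian product, which is why the rename step matters.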


More about JOIN

• A more general but non-standard definition for NATURAL JOIN is

Q = R *(<list1>),(<list2>) S

where <list1> and <list2> are lists of attributes of R and S, respectively, whose values are compared for equality.

• In general, if R has nR tuples and S has nS tuples, the result of a JOIN operation R ⋈ <join condition> S will have between zero and nR * nS tuples.

• The expected size of the join result divided by the maximum size nR * nS leads to a ratio called join selectivity, which is a property of each join condition.
More about JOIN

• If there is no join condition, all combinations of tuples qualify and the JOIN becomes a CARTESIAN PRODUCT, also called CROSS PRODUCT or CROSS JOIN.

• The NATURAL JOIN or EQUIJOIN operation can also be specified among multiple tables, leading to an n-way join. For example, a three-way join:

((PROJECT ⋈ DNUM=DNUMBER DEPARTMENT) ⋈ MGRSSN=SSN EMPLOYEE)


Complete Set of Relational Operations

• The set of operations {σ, π, ∪, −, ×} is called a complete set because any other relational algebra expression can be expressed by a combination of these five operations.

For example:
R ∩ S ≡ (R ∪ S) − ((R − S) ∪ (S − R))
R ⋈<join condition> S ≡ σ<join condition> (R × S)
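The INTERSECTION identity can be checked on a small case with Python's built-in sets, where | is union, - is set difference, and & is intersection. The two relations below are made-up sets of tuples, and a check on one example is of course not a proof.

```python
# Two union-compatible relations, modelled as sets of tuples (made-up data).
R = {("a", 1), ("b", 2), ("c", 3)}
S = {("b", 2), ("c", 3), ("d", 4)}

# INTERSECTION written with only UNION and SET DIFFERENCE.
derived = (R | S) - ((R - S) | (S - R))

# The built-in intersection, for comparison.
direct = R & S
```

Both expressions yield the tuples that appear in R and in S.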
DIVISION
Let Z be the set of attributes of R.

Let X be the set of attributes of S, where X ⊆ Z.

Then the DIVISION of R by S is denoted by T(Y) = R(Z) ÷ S(X),

where the attributes of the resulting relation T are the attributes of R that are not attributes of S,

i.e. Y = Z − X (and hence Z = X ∪ Y),

and the tuples of T are such that each tuple of T must appear in R in combination with every tuple in S.
DIVISION: Example
T = R ÷ S

Attributes of T = attributes of R − attributes of S
= {A, B} − {A}
= {B}

In general, the result of DIVISION is a relation T(Y) that includes a tuple t if tuples tR appear in R with tR[Y] = t, and with tR[X] = tS for every tuple tS in S.
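The definition can be turned into a short Python sketch. The function name divide is hypothetical, relations are modelled as lists of dictionaries, and the tuples of R(A, B) and S(A) are made up to mirror the attribute sets of the example above.

```python
def divide(r, s, x_attrs):
    """R(Z) ÷ S(X): the Y-values (Y = Z − X) that occur in r with every tuple of s."""
    if not r:
        return set()
    y_attrs = [a for a in r[0] if a not in x_attrs]
    s_values = {tuple(t[a] for a in x_attrs) for t in s}
    seen = {}  # maps each Y-value to the set of X-values it is paired with in r
    for t in r:
        y = tuple(t[a] for a in y_attrs)
        seen.setdefault(y, set()).add(tuple(t[a] for a in x_attrs))
    return {y for y, xs in seen.items() if s_values <= xs}

# Made-up states of R(A, B) and S(A).
R = [
    {"A": "a1", "B": "b1"}, {"A": "a2", "B": "b1"}, {"A": "a3", "B": "b1"},
    {"A": "a1", "B": "b2"},
    {"A": "a1", "B": "b4"}, {"A": "a2", "B": "b4"}, {"A": "a3", "B": "b4"},
]
S = [{"A": "a1"}, {"A": "a2"}, {"A": "a3"}]

T = divide(R, S, ["A"])  # T(B) = R(A, B) ÷ S(A)
```

Here b1 and b4 qualify because each appears in R together with a1, a2, and a3, while b2 is paired only with a1 and is therefore excluded.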
DIVISION

Retrieve the names of employees who work on all the projects

T1(SSN, PNO) ← π ESSN, PNO (WORKS_ON)

T2 ← π PNO (WORKS_ON)
T3 ← (T1 ÷ T2)
RESULT ← π FNAME, MINIT, LNAME (T3 * EMPLOYEE)
DIVISION

Retrieve the names of employees who work on all the projects that John Smith works on.

Solution:
Step 1: Retrieve the project numbers that John Smith works on and store them in the temporary relation SMITH_PNOS.

SMITH ← σ FNAME=‘John’ AND LNAME=‘Smith’ (EMPLOYEE)

SMITH_PNOS ← π PNO (WORKS_ON ⋈ ESSN=SSN SMITH)

Step 2: Project the ESSN and PNO attributes of the WORKS_ON relation and store the result in the relation SSN_PNOS.
SSN_PNOS ← π ESSN, PNO (WORKS_ON)
Step 3: Finally, apply the DIVISION operation to the two relations, which gives the SSNs of the employees who work on all the projects that John Smith works on.

SSNS(SSN) ← SSN_PNOS ÷ SMITH_PNOS


We can then retrieve the names of the employees with these SSNs by taking the NATURAL JOIN of SSNS and EMPLOYEE and projecting the desired columns.

RESULT ← π FNAME, LNAME (SSNS * EMPLOYEE)

The complete sequence of relational expressions for the query is therefore:

1. SMITH ← σ FNAME=‘John’ AND LNAME=‘Smith’ (EMPLOYEE)

2. SMITH_PNOS ← π PNO (WORKS_ON ⋈ ESSN=SSN SMITH)

3. SSN_PNOS ← π ESSN, PNO (WORKS_ON)

4. SSNS(SSN) ← SSN_PNOS ÷ SMITH_PNOS

5. RESULT ← π FNAME, LNAME (SSNS * EMPLOYEE)


Aggregate Functions
• Common functions applied to collections of numeric values include SUM, AVERAGE, MAXIMUM, and MINIMUM. The COUNT function is used for counting tuples or values.

• An AGGREGATE FUNCTION operation is defined using the symbol ℱ (pronounced "script F").

Example: Find the number of employees and the average salary of employees.

ℱ COUNT SSN, AVERAGE SALARY (EMPLOYEE)


Aggregate Functions

ℱ MAX Salary (Employee) retrieves the maximum Salary value from the Employee relation (55000).

ℱ MIN Salary (Employee) retrieves the minimum Salary value from the Employee relation (25000).

ℱ SUM Salary (Employee) retrieves the sum of the Salary values from the Employee relation.
Grouping

• Grouping partitions the tuples of a relation by the values of some of their attributes and then applies an aggregate function independently to each group.

General form:

<grouping attributes> ℱ <function list> (R)


Grouping
• Example: To group employees by DNO (department number) and compute the number of employees and the average salary per department:
DNO ℱ COUNT SSN, AVERAGE Salary (EMPLOYEE)

Note: Duplicates are not eliminated when an aggregate function is applied.
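Grouping followed by aggregation can be sketched in Python as below. The function name group_count_avg and the output attribute names COUNT and AVERAGE are hypothetical, and the EMPLOYEE tuples are invented for illustration.

```python
from collections import defaultdict

def group_count_avg(relation, group_attr, avg_attr):
    """Partition tuples by group_attr, then compute COUNT and AVERAGE per group."""
    groups = defaultdict(list)
    for t in relation:
        groups[t[group_attr]].append(t[avg_attr])
    return [
        {group_attr: key, "COUNT": len(values), "AVERAGE": sum(values) / len(values)}
        for key, values in groups.items()
    ]

# Made-up tuples; two employees of department 4 earn the same salary.
EMPLOYEE = [
    {"SSN": "e1", "DNO": 5, "SALARY": 30000},
    {"SSN": "e2", "DNO": 5, "SALARY": 40000},
    {"SSN": "e3", "DNO": 4, "SALARY": 25000},
    {"SSN": "e4", "DNO": 4, "SALARY": 25000},
    {"SSN": "e5", "DNO": 4, "SALARY": 43000},
]

# DNO F COUNT SSN, AVERAGE SALARY (EMPLOYEE)
RESULT = group_count_avg(EMPLOYEE, "DNO", "SALARY")
```

The repeated salary in department 4 is counted twice, in line with the note that duplicates are not eliminated before an aggregate function is applied.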


Examples of Queries on Company database

Q1: Retrieve the name and address of all employees who work
for the ‘Research’ department.

RESEARCH_DEPT ← σ DNAME=‘Research’ (DEPARTMENT)

RESEARCH_EMPS ← (RESEARCH_DEPT ⋈ DNUMBER=DNO EMPLOYEE)

RESULT ← π FNAME, LNAME, ADDRESS (RESEARCH_EMPS)


Examples of Queries on Company database

Q2: For every project located in ‘Stafford’, list the project number, the controlling department number, and the department manager’s last name, address, and birth date.

STAFFORD_PROJS = σ PLOCATION=‘Stafford’ (PROJECT)

CONTR_DEPT = (STAFFORD_PROJS ⋈ DNUM=DNUMBER DEPARTMENT)

PROJ_DEPT_MGR = (CONTR_DEPT ⋈ MGRSSN=SSN EMPLOYEE)

RESULT = π PNUMBER, DNUM, LNAME, ADDRESS, BDATE (PROJ_DEPT_MGR)


Examples of Queries on Company database

Q3: Find the names of employees who work on all the projects
controlled by department number 5.

DEPT5_PROJS(PNO) = π PNUMBER (σ DNUM=5 (PROJECT))

EMP_PROJ(SSN, PNO) = π ESSN, PNO (WORKS_ON)

RESULT_EMP_SSNS = EMP_PROJ ÷ DEPT5_PROJS

RESULT = π LNAME, FNAME (RESULT_EMP_SSNS * EMPLOYEE)


Examples of Queries on Company database

Q4: Make a list of project numbers for projects that involve an employee
whose last name is ‘Smith’, either as a worker or as a manager of the
department that controls the project.

SMITHS(ESSN) = π SSN (σ LNAME=‘Smith’ (EMPLOYEE))

SMITH_WORKER_PROJS = π PNO (WORKS_ON * SMITHS)

MGRS = π LNAME, DNUMBER (EMPLOYEE ⋈ SSN=MGRSSN DEPARTMENT)

SMITH_MANAGED_DEPTS(DNUM) = π DNUMBER (σ LNAME=‘Smith’ (MGRS))

SMITH_MGR_PROJS(PNO) = π PNUMBER (SMITH_MANAGED_DEPTS * PROJECT)

RESULT = (SMITH_WORKER_PROJS ∪ SMITH_MGR_PROJS)


Examples of Queries on Company database

Q5: List the names of all employees with two or more dependents.

T1(SSN, NO_OF_DEPS) = ESSN ℱ COUNT DEPENDENT_NAME (DEPENDENT)

T2 = σ NO_OF_DEPS ≥ 2 (T1)

RESULT = π LNAME, FNAME (T2 * EMPLOYEE)


Examples of Queries on Company database

Q6: Retrieve the names of employees who have no dependents.

ALL_EMPS ← π SSN (EMPLOYEE)

EMPS_WITH_DEPS(SSN) ← π ESSN (DEPENDENT)

EMPS_WITHOUT_DEPS ← (ALL_EMPS − EMPS_WITH_DEPS)

RESULT ← π LNAME, FNAME (EMPS_WITHOUT_DEPS * EMPLOYEE)


Examples of Queries on Company database

Q7: List the names of managers who have at least one dependent.

MGRS(SSN) = π MGRSSN (DEPARTMENT)

EMPS_WITH_DEPS(SSN) = π ESSN (DEPENDENT)

MGRS_WITH_DEPS = (MGRS ∩ EMPS_WITH_DEPS)

RESULT = π LNAME, FNAME (MGRS_WITH_DEPS * EMPLOYEE)


Recursive Closure Operations
• This operation is applied to a recursive relationship between tuples of the same type, such as the relationship between an employee and a supervisor.

Example: To retrieve the SSNs of all employees directly supervised (at level one) by the employee e whose name is ‘James Borg’:
BORG_SSN = π SSN (σ FNAME=‘James’ AND LNAME=‘Borg’ (EMPLOYEE))

SUPERVISION(SSN1, SSN2) = π SSN, SUPERSSN (EMPLOYEE)

RESULT1(SSN) = π SSN1 (SUPERVISION ⋈ SSN2=SSN BORG_SSN) (at level 1)


Recursive Closure Operations

To retrieve the SSNs of all employees supervised at level two by ‘James Borg’, repeat the same operation on the result of the first level:
RESULT2(SSN) = π SSN1 (SUPERVISION ⋈ SSN2=SSN RESULT1) (at level 2)

To get both sets of employees supervised at levels 1 and 2 by ‘James Borg’, apply the UNION operation to the two results:
RESULT = RESULT2 ∪ RESULT1
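The level-by-level idea can be sketched in Python. SUPERVISION is modelled as a list of (SSN1, SSN2) pairs in which SSN1 is supervised by SSN2. The function names supervised_by and supervised_at_all_levels, and the SSN values, are assumptions made for illustration. The second function simply repeats the one-level step until nothing new is found, which is one way of computing supervision at all levels.

```python
def supervised_by(supervision, supervisors):
    """One level: the SSN1 values whose supervisor SSN2 is in `supervisors`."""
    return {ssn1 for ssn1, ssn2 in supervision if ssn2 in supervisors}

def supervised_at_all_levels(supervision, boss):
    """Repeat the one-level step until no new employees are found."""
    result, frontier = set(), {boss}
    while frontier:
        frontier = supervised_by(supervision, frontier) - result
        result |= frontier
    return result

# Made-up (SSN1, SSN2) pairs: SSN1 is directly supervised by SSN2.
SUPERVISION = [("e2", "e1"), ("e3", "e1"), ("e4", "e2"), ("e5", "e2"), ("e6", "e4")]

RESULT1 = supervised_by(SUPERVISION, {"e1"})     # level 1
RESULT2 = supervised_by(SUPERVISION, RESULT1)    # level 2
RESULT = RESULT1 | RESULT2                       # levels 1 and 2 combined
ALL_LEVELS = supervised_at_all_levels(SUPERVISION, "e1")
```

In this sample, e6 is supervised at level three, so it appears in ALL_LEVELS but not in RESULT.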


3.7 Relational Database Design Using the ER-to-Relational
Mapping Algorithm

ER-to-Relational Mapping Algorithm

Step 1: Mapping of Regular Entity Types

Step 2: Mapping of Weak Entity Types

Step 3: Mapping of Binary 1:1 Relationship Types

Step 4: Mapping of Binary 1:N Relationship Types.

Step 5: Mapping of Binary M:N Relationship Types.

Step 6: Mapping of Multivalued attributes.

Step 7: Mapping of N-ary Relationship Types.


FIGURE 7.1 The ER conceptual schema diagram for the COMPANY database.
FIGURE 7.2 Result of mapping the COMPANY ER schema into a relational schema.
Step 1: Mapping of Regular Entity Types.

– For each regular (strong) entity type E in the ER schema, create a relation R that includes all the simple attributes of E.

– Choose one of the key attributes of E as the primary key for R. If the chosen key of E is composite, the set of simple attributes that form it will together form the primary key of R.

EMPLOYEE(FNAME, MNAME, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY)

DEPARTMENT(DNAME, DNUMBER)

PROJECT(PNAME, PNUMBER, PLOCATION)
Step 2: Mapping of Weak Entity Types

– For each weak entity type W in the ER schema with owner entity type E,
create a relation R and include all simple attributes (or simple components
of composite attributes) of W as attributes of R.
– In addition, include as foreign key attributes of R the primary key
attribute(s) of the relation(s) that correspond to the owner entity type(s).
– The primary key of R is the combination of the primary key(s) of the
owner(s) and the partial key of the weak entity type W, if any.

EMPLOYEE(FNAME, MNAME, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY)

DEPARTMENT(DNAME, DNUMBER)

PROJECT(PNAME, PNUMBER, PLOCATION)

DEPENDENT(ESSN, DEPENDENT_NAME, SEX, BDATE, RELATIONSHIP)


Step 3: Mapping of Binary 1:1 Relationship Types

For each binary 1:1 relationship type R in the ER schema, identify the relations S and T that correspond to the entity types participating in R.
• There are three possible approaches:

1. Foreign key approach
2. Merged relation option
3. Cross-reference or relationship relation option
Step 3: Mapping of Binary 1:1 Relationship Types

Foreign key approach:

– Choose one of the relations, say S, and include as a foreign key in S the primary key of T. It is better to choose an entity type with total participation in R in the role of S. Include all the simple attributes of the 1:1 relationship type R as attributes of S.

Example: The 1:1 relationship type MANAGES is mapped by choosing the participating entity type DEPARTMENT to serve in the role of S, because its participation in the MANAGES relationship type is total.
Step 3: Mapping of Binary 1:1 Relationship Types

Step 3 Result

EMPLOYEE(FNAME, MNAME, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY)

DEPARTMENT(DNAME, DNUMBER, MGRSSN, MGRSTARTDATE)

PROJECT(PNAME, PNUMBER, PLOCATION)

DEPENDENT(ESSN, DEPENDENT_NAME, SEX, BDATE, RELATIONSHIP)


Step 4: Mapping of Binary 1:N Relationship Types.

– For each regular binary 1:N relationship type R, identify the relation S that represents the participating entity type at the N-side of the relationship type.
– Include as foreign key in S the primary key of the relation T that represents the other entity type participating in R.
– Include any simple attributes of the 1:N relationship type as attributes of S.
Step 4: Mapping of Binary 1:N Relationship Types.
Step 4 Result

EMPLOYEE(FNAME, MNAME, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY, SUPERSSN, DNO)

DEPARTMENT(DNAME, DNUMBER, MGRSSN, MGRSTARTDATE)

PROJECT(PNAME, PNUMBER, PLOCATION, DNUM)

DEPENDENT(ESSN, DEPENDENT_NAME, SEX, BDATE, RELATIONSHIP)
Step 5: Mapping of Binary M:N Relationship Types.

– For each regular binary M:N relationship type R, create a new relation S to represent R.
– Include as foreign key attributes in S the primary keys of the relations that represent the participating entity types; their combination will form the primary key of S.
– Also include any simple attributes of the M:N relationship type (or simple components of composite attributes) as attributes of S.
Step 5: Mapping of Binary M:N Relationship Types
Step 5 Result

EMPLOYEE(FNAME, MNAME, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY, SUPERSSN, DNO)

DEPARTMENT(DNAME, DNUMBER, MGRSSN, MGRSTARTDATE)

PROJECT(PNAME, PNUMBER, PLOCATION, DNUM)

WORKS_ON(ESSN, PNO, HOURS)

DEPENDENT(ESSN, DEPENDENT_NAME, SEX, BDATE, RELATIONSHIP)
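One possible way to realise the WORKS_ON relation from Step 5 is sketched below with Python's standard sqlite3 module. The choice of SQLite, the column types, the cut-down EMPLOYEE and PROJECT tables, the helper try_insert, and the sample rows are all assumptions made for illustration. The sketch shows the two foreign keys whose combination forms the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN TEXT PRIMARY KEY, LNAME TEXT);
CREATE TABLE PROJECT  (PNUMBER INTEGER PRIMARY KEY, PNAME TEXT);
-- M:N relationship type WORKS_ON mapped to its own relation
CREATE TABLE WORKS_ON (
    ESSN  TEXT    REFERENCES EMPLOYEE(SSN),
    PNO   INTEGER REFERENCES PROJECT(PNUMBER),
    HOURS REAL,
    PRIMARY KEY (ESSN, PNO)
);
""")

# Made-up sample rows.
conn.execute("INSERT INTO EMPLOYEE VALUES ('e1', 'Smith')")
conn.execute("INSERT INTO PROJECT VALUES (1, 'ProductX')")
conn.execute("INSERT INTO PROJECT VALUES (2, 'ProductY')")
conn.execute("INSERT INTO WORKS_ON VALUES ('e1', 1, 32.5)")

def try_insert(row):
    """Return True if WORKS_ON accepts the row, False if a key constraint rejects it."""
    try:
        conn.execute("INSERT INTO WORKS_ON VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False
```

In this sketch, a second row for the same (ESSN, PNO) pair is rejected by the composite primary key, and a row that refers to an employee not present in EMPLOYEE is rejected by the foreign key.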


Step 6: Mapping of Multivalued attributes

– For each multivalued attribute A, create a new relation R. This relation R will include an attribute corresponding to A, plus the primary key attribute K (as a foreign key in R) of the relation that represents the entity type or relationship type that has A as an attribute.

– The primary key of R is the combination of A and K. If the multivalued attribute is composite, we include its simple components.
Step 6: Mapping of Multivalued attributes

Example: The relation DEPT_LOCATIONS is created. The attribute DLOCATION represents the multivalued attribute LOCATIONS of DEPARTMENT, while DNUMBER (as foreign key) represents the primary key of the DEPARTMENT relation. The primary key of DEPT_LOCATIONS is the combination {DNUMBER, DLOCATION}.
Step 6: Mapping of Multivalued attributes

Step 6 Result

EMPLOYEE(FNAME, MNAME, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY, SUPERSSN, DNO)

DEPARTMENT(DNAME, DNUMBER, MGRSSN, MGRSTARTDATE)

DEPT_LOCATIONS(DNUMBER, DLOCATION)

PROJECT(PNAME, PNUMBER, PLOCATION, DNUM)

WORKS_ON(ESSN, PNO, HOURS)

DEPENDENT(ESSN, DEPENDENT_NAME, SEX, BDATE, RELATIONSHIP)


Step 7: Mapping of N-ary Relationship Types

– For each n-ary relationship type R, where n > 2, create a new relation S to represent R.
– Include as foreign key attributes in S the primary keys of the relations that represent the participating entity types.
– Also include any simple attributes of the n-ary relationship type (or simple components of composite attributes) as attributes of S.
Step 7: Mapping of N-ary Relationship Types.

Example
Assignment – II
Specify the following queries on the COMPANY database schema, using the relational operators.

a. Retrieve the names of all employees in department 5 who work more than 10 hours per week on the ‘ProductX’ project.

b. List the names of all employees who have a dependent with the same first name as themselves.

c. Find the names of all employees who are directly supervised by ‘Franklin Wong’.

d. For each project, list the project name and the total hours per week (by all employees) spent on that project.

e. Retrieve the names of all employees who work on every project.


Assignment - II
f. Retrieve the names of all employees who do not work on any project.

g. For each department, retrieve the department name and the average salary of all employees working in that department.

h. Retrieve the average salary of all female employees.

i. Find the names and addresses of all employees who work on at least one project located in Houston but whose department has no location in Houston.

j. List the last names of all department managers who have no dependents.
Mapping Exercise-1
Mapping Exercise-2
An ER schema for a SHIP_TRACKING database.
