Part2 - Ch5 - Relational Database Design by ER- and EER-to-Relational Mapping
Outline:
5.1 Relational Model Concepts
A relational database is basically a set of relations (tables).
A Relation is a mathematical concept based on the ideas of sets.
The model was first introduced by Ted Codd of IBM Research in 1970 in the following paper: "A
Relational Model of Data for Large Shared Data Banks," Communications of the ACM, June 1970.
Informal Definitions
Example of a Relation
Formal Definitions
1. Schema
o The Schema (or description) of a Relation:
Denoted by R(A1, A2, ..., An)
R is the name of the relation
The attributes of the relation are A1, A2, ..., An
o Example:
CUSTOMER (Cust-id, Cust-name, Address, Phone#)
CUSTOMER is the relation name
Defined over the four attributes: Cust-id, Cust-name, Address, Phone#
o For example, the domain of Cust-id is the set of 6-digit numbers.
o The data domain may have a data-type and/or format.
2. Tuple
3. State
o The relation state is a subset of the Cartesian product of the domains of its attributes
o Each domain contains the set of all possible values the attribute can take.
o Example: attribute Cust-name is defined over the domain of character strings of maximum
length 25
dom(Cust-name) is varchar(25)
4. Summary
Characteristics of Relations:
1. Ordering of tuples in a relation r(R): The tuples are not considered to be ordered, even though they
appear in a particular order when shown in tabular form.
2. Ordering of attributes in a relation schema R (and of values within each tuple): We will consider
the attributes in R(A1, A2, ...,An) and the values in t=<v1, v2, ..., vn> to be ordered.
3. Values in a tuple:
All values are considered atomic (indivisible).
Each value in a tuple must be from the domain of the attribute for that column
If t = <v1, v2, …, vn> is a tuple (row) in the relation state r of R(A1, A2, …, An),
then each vi must be a value from dom(Ai).
A special null value is used to represent values that are unknown or inapplicable to certain
tuples.
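The rule that each value vi must come from dom(Ai) can be sketched in Python. This is only an illustration, not part of the original notes: the Cust-id and Cust-name domains follow the examples above, while the Address and Phone# domains, the sample tuples, and the use of None to stand for null are assumptions.

```python
# Assumed domains for CUSTOMER(Cust-id, Cust-name, Address, Phone#).
# Each domain is modelled as a predicate that accepts or rejects a value.
DOMAINS = {
    "Cust-id": lambda v: isinstance(v, str) and len(v) == 6 and v.isdigit(),
    "Cust-name": lambda v: isinstance(v, str) and len(v) <= 25,
    "Address": lambda v: isinstance(v, str),  # assumed: any string
    "Phone#": lambda v: isinstance(v, str),   # assumed: any string
}


def is_valid_tuple(t):
    """True if t has exactly the schema's attributes and every value is
    either null (None) or a member of the attribute's domain."""
    if set(t) != set(DOMAINS):
        return False
    return all(v is None or DOMAINS[a](v) for a, v in t.items())


# Hypothetical sample tuples.
good = {"Cust-id": "123456", "Cust-name": "Ann", "Address": None, "Phone#": "555-0100"}
bad = {"Cust-id": "12", "Cust-name": "Ann", "Address": "x", "Phone#": "555-0100"}
```

Here is_valid_tuple(good) holds, while bad is rejected because its Cust-id is not a 6-digit number.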
Example (1):
Consider the EMPLOYEE relation schema:
Employee (SSN,Empid,Fname,Mname,Lname,Add,Age)
1. Superkey of R:
{SSN}, {Empid}, {SSN, Empid}, {SSN, Fname}, {SSN, Mname}, {SSN, Lname}, {SSN, Add},
{SSN, Age}, {SSN, Empid, Fname}, {SSN, Empid, Lname}, {SSN, Empid, Mname},
{SSN, Empid, Add}, {SSN, Empid, Age}, {Fname, Mname, Lname}, etc.
2. Key of R:
{SSN}, {Empid},{Fname,Mname,Lname}
3. Primary key:
{SSN}
4. Alternate key:
{Empid},{Fname,Mname,Lname}
Example (2):
Consider the following relation:
X Y Z
1 A 20
3 B 21
4 A 15
1 C 30
1. Superkey of R:
{Z}, {X, Y}, {X, Z}, {Y, Z}, {X, Y, Z}
2. Key of R:
{Z}, {X, Y}
3. Primary key:
{Z}
4. Alternate key:
{X, Y}
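The answers to Example (2) can be checked mechanically. The sketch below is an illustration added here, not the notes' own method: it enumerates every non-empty attribute subset of the four-row instance and tests it for uniqueness. Keep in mind that a key is a property of the schema that must hold in every instance, so a check on a single instance can only rule candidates out, never confirm them in general.

```python
from itertools import combinations

ATTRS = ("X", "Y", "Z")
ROWS = [(1, "A", 20), (3, "B", 21), (4, "A", 15), (1, "C", 30)]


def is_superkey(attrs):
    """True if no two rows agree on all attributes in attrs (this instance only)."""
    idx = [ATTRS.index(a) for a in attrs]
    projected = [tuple(row[i] for i in idx) for row in ROWS]
    return len(set(projected)) == len(ROWS)


# All non-empty attribute subsets whose values are unique in this instance.
superkeys = [
    set(c)
    for n in range(1, len(ATTRS) + 1)
    for c in combinations(ATTRS, n)
    if is_superkey(c)
]

# A key is a minimal superkey: no proper subset of it is itself a superkey.
keys = [s for s in superkeys if not any(other < s for other in superkeys)]
```

For this instance, superkeys contains the five sets listed in the example and keys contains {Z} and {X, Y}.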
Example (3):
Consider relation (schema) Pay(Grade, Salary)
Pay:
Grade Salary
8 900
9 700
7 1300
We are told that for any instance of Pay, any two tuples that are equal on Grade
are (completely) equal
o Of course, if each Grade appears in only one tuple, this is automatically
true
Grade is a key, but Salary is not a key
o because we are not told that any two tuples that are equal on Salary are
equal on Grade in every instance of Pay
Find the superkeys, candidate keys, primary key, and alternate keys of the following CAR relation
schema:
CAR (State, Reg#, SerialNo, Model, Year)
Find the superkeys, candidate keys, primary key, and alternate keys of the following
relation:
A B C D E
A1 Hh 101 D10 10
A2 Mm 200 D30 20
A3 Ff 50 D90 30
A4 Ll 60 D53 30
A1 Ff 70 D6 10
5. Foreign key
A foreign key is a column or a combination of columns that is used to establish and enforce
a link between the data in two tables.
It acts as a cross-reference between tables because it references the primary key of another
table.
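A foreign key link of this kind can be tried out with Python's built-in sqlite3 module. This is a minimal sketch under stated assumptions: CUSTOMER is borrowed from the earlier schema example (with column names adapted to plain SQL identifiers), while the ORDERS table, its columns, and all sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE CUSTOMER (cust_id TEXT PRIMARY KEY, cust_name TEXT)")
# ORDERS is a hypothetical table; its cust_id column cross-references CUSTOMER.
conn.execute("""
    CREATE TABLE ORDERS (
        order_no INTEGER PRIMARY KEY,
        cust_id  TEXT REFERENCES CUSTOMER(cust_id)
    )""")
conn.execute("INSERT INTO CUSTOMER VALUES ('123456', 'Ann')")
conn.execute("INSERT INTO ORDERS VALUES (1, '123456')")  # accepted: customer exists

try:
    conn.execute("INSERT INTO ORDERS VALUES (2, '999999')")  # no such customer
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM ORDERS").fetchone()[0]
```

The second insert is refused because '999999' does not match any primary key value in CUSTOMER, which is exactly the enforcement the definition describes.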
5.3 ER-to-Relational Mapping
Figure 2: Result of mapping the COMPANY ER schema into a relational schema.
For each regular (strong) entity set E in the ER schema, create a relation R that includes all
the simple attributes of E.
Choose one of the key attributes of E as the primary key for R. If the chosen key of E is
composite, the set of simple attributes that form it will together form the primary key of R.
Example: We create the relations EMPLOYEE, DEPARTMENT, and PROJECT in the
relational schema corresponding to the regular entities in the ER diagram. SSN,
DNUMBER, and PNUMBER are the primary keys for the relations EMPLOYEE,
DEPARTMENT, and PROJECT as shown.
For each weak entity set W in the ER schema with owner entity set E, create a relation R
and include all simple attributes (or simple components of composite attributes) of W as
attributes of R.
In addition, include as foreign key attributes of R the primary key attribute(s) of the
relation(s) that correspond to the owner entity set(s).
The primary key of R is the combination of the primary key(s) of the owner(s) and the
partial key of the weak entity set W, if any.
Example: Create the relation DEPENDENT in this step to correspond to the weak entity
set DEPENDENT. Include the primary key SSN of the EMPLOYEE relation as a foreign
key attribute of DEPENDENT (renamed to ESSN).
The primary key of the DEPENDENT relation is the combination {ESSN,
DEPENDENT_NAME} because DEPENDENT_NAME is the partial key of
DEPENDENT.
Note: It is common to choose the CASCADE option for the referential triggered action on
the foreign key in the relation corresponding to the weak entity type, since a weak entity
has an existence dependency on its owner entity. This can be used for both ON UPDATE
and ON DELETE.
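The weak-entity mapping and the CASCADE note can be sketched with sqlite3. The relation and attribute names (EMPLOYEE, SSN, DEPENDENT, ESSN, DEPENDENT_NAME) come from the example above; the column types and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce the constraint
conn.execute("CREATE TABLE EMPLOYEE (SSN TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE DEPENDENT (
        ESSN           TEXT NOT NULL,
        DEPENDENT_NAME TEXT NOT NULL,
        PRIMARY KEY (ESSN, DEPENDENT_NAME),
        FOREIGN KEY (ESSN) REFERENCES EMPLOYEE(SSN)
            ON UPDATE CASCADE ON DELETE CASCADE
    )""")
# Hypothetical sample data.
conn.execute("INSERT INTO EMPLOYEE VALUES ('111')")
conn.executemany("INSERT INTO DEPENDENT VALUES (?, ?)",
                 [("111", "Alice"), ("111", "Bob")])

# Deleting the owner removes its dependents as well.
conn.execute("DELETE FROM EMPLOYEE WHERE SSN = '111'")
remaining = conn.execute("SELECT COUNT(*) FROM DEPENDENT").fetchone()[0]
```

After the owner row is deleted, no DEPENDENT rows remain, which mirrors the existence dependency of a weak entity on its owner.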
For each binary 1:1 relationship set R in the ER schema, identify the relations S and T that
correspond to the entity sets participating in R.
Choose one of the relations, say S, and include a foreign key in S that references the
primary key of T. It is better to choose an entity set with total participation in R in the role
of S.
Include any simple attributes (or components of composite attributes) of the 1:1
relationship set as attributes of S.
Example: The 1:1 relationship set MANAGES is mapped by choosing the participating entity set
DEPARTMENT to serve in the role of S, because its participation in the MANAGES
relationship set is total.
Note: an alternative mapping of a 1:1 relationship type is possible by merging the two
entity types and the relationship into a single relation. This is appropriate when both
participations are total.
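The foreign-key approach for MANAGES might look as follows in sqlite3. The column name MGRSSN, the column types, and the sample rows are assumptions; the notes only state that DEPARTMENT plays the role of S. NOT NULL reflects DEPARTMENT's total participation, and UNIQUE keeps the relationship 1:1 by preventing one employee from managing two departments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE EMPLOYEE (SSN TEXT PRIMARY KEY)")
# MGRSSN is an assumed name for the foreign key that represents MANAGES.
conn.execute("""
    CREATE TABLE DEPARTMENT (
        DNUMBER INTEGER PRIMARY KEY,
        MGRSSN  TEXT NOT NULL UNIQUE REFERENCES EMPLOYEE(SSN)
    )""")
# Hypothetical sample data.
conn.execute("INSERT INTO EMPLOYEE VALUES ('111')")
conn.execute("INSERT INTO DEPARTMENT VALUES (1, '111')")

try:
    conn.execute("INSERT INTO DEPARTMENT VALUES (2, '111')")  # same manager again
    second_accepted = True
except sqlite3.IntegrityError:
    second_accepted = False
```

The second department is rejected because employee '111' already manages department 1.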
For each regular binary 1:N relationship set R, identify the relation S that represents the
participating entity set at the N-side of the relationship set.
Include as foreign key in S the primary key of the relation T that represents the other entity
set participating in R.
Include any simple attributes (or components of composite attributes) of the 1:N
relationship set as attributes of S.
For each regular binary M:N relationship set R, create a new relation S to represent R.
Include as foreign key attributes in S the primary keys of the relations that represent the
participating entity sets; their combination will form the primary key of S.
Also include any simple attributes of the M:N relationship set (or simple components of
composite attributes) as attributes of S.
Example: The M:N relationship set WORKS_ON from the ER diagram is mapped by
creating a relation WORKS_ON in the relational database schema. The primary keys of the
PROJECT and EMPLOYEE relations are included as foreign keys in WORKS_ON and
renamed PNO and ESSN, respectively.
Attribute HOURS in WORKS_ON represents the HOURS attribute of the relationship set. The
primary key of the WORKS_ON relation is the combination of the foreign key attributes
{ESSN, PNO}.
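The WORKS_ON mapping can be sketched in sqlite3 as below. Relation and attribute names follow the example; the column types and sample rows are assumptions. The composite primary key {ESSN, PNO} allows an employee to appear with many projects and a project with many employees, but each pairing only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE EMPLOYEE (SSN TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE PROJECT (PNUMBER INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE WORKS_ON (
        ESSN  TEXT    NOT NULL REFERENCES EMPLOYEE(SSN),
        PNO   INTEGER NOT NULL REFERENCES PROJECT(PNUMBER),
        HOURS REAL,
        PRIMARY KEY (ESSN, PNO)
    )""")
# Hypothetical sample data.
conn.executemany("INSERT INTO EMPLOYEE VALUES (?)", [("111",), ("222",)])
conn.executemany("INSERT INTO PROJECT VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO WORKS_ON VALUES (?, ?, ?)",
                 [("111", 1, 10.0), ("111", 2, 5.0), ("222", 1, 20.0)])

try:
    conn.execute("INSERT INTO WORKS_ON VALUES ('111', 1, 3.0)")  # pair already present
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

works_on_rows = conn.execute("SELECT COUNT(*) FROM WORKS_ON").fetchone()[0]
```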
Note: we can always map 1:1 or 1:N relationships in a manner similar to M:N
relationships. This alternative is particularly useful when few relationship instances exist,
in order to avoid null values in foreign keys. In this case, the primary key of the
"relationship" relation will be only one of the foreign keys that reference the participating
"entity" relations. For a 1:N relationship, this will be the foreign key that references the
entity relation on the N-side. For a 1:1 relationship, the foreign key that references the
entity relation with total participation (if any) is chosen as primary key.
For each multivalued attribute A, create a new relation R. This relation R will include an
attribute corresponding to A, plus the primary key attribute K (as a foreign key in R) of the
relation that represents the entity set or relationship set that has A as an attribute.
The primary key of R is the combination of A and K. If the multivalued attribute is
composite, we include its simple components.
Example: Here DNUMBER, as a foreign key, represents the primary key of the DEPARTMENT relation. The primary key
of R is the combination {DNUMBER, DLOCATION}.
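A sketch of this step in sqlite3 follows. The relation name DEPT_LOCATIONS, the column types, and the sample values are assumptions made for illustration; the text above only fixes the attributes DNUMBER and DLOCATION and their role as the composite primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE DEPARTMENT (DNUMBER INTEGER PRIMARY KEY)")
# DEPT_LOCATIONS is an assumed name for the relation R of this step:
# one row per (department, location) pair.
conn.execute("""
    CREATE TABLE DEPT_LOCATIONS (
        DNUMBER   INTEGER NOT NULL REFERENCES DEPARTMENT(DNUMBER),
        DLOCATION TEXT    NOT NULL,
        PRIMARY KEY (DNUMBER, DLOCATION)
    )""")
# Hypothetical sample data: one department with two locations.
conn.execute("INSERT INTO DEPARTMENT VALUES (5)")
conn.executemany("INSERT INTO DEPT_LOCATIONS VALUES (?, ?)",
                 [(5, "Bellaire"), (5, "Houston")])

locations = [row[0] for row in conn.execute(
    "SELECT DLOCATION FROM DEPT_LOCATIONS WHERE DNUMBER = 5 ORDER BY DLOCATION")]
```

Each value of the multivalued attribute becomes its own row, so a department may have any number of locations without changing the schema.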
For each n-ary relationship set R, where n>2, create a new relation S to represent R.
Include as foreign key attributes in S the primary keys of the relations that represent the
participating entity sets.
Also include any simple attributes of the n-ary relationship set (or simple components of
composite attributes) as attributes of S.
Example: Consider the relationship set SUPPLY in the following ER diagram. It can be mapped
to the relation SUPPLY shown in the relational schema, whose primary key is the
combination of the three foreign keys {SNAME, PARTNO, PROJNAME}.
Figure 4: Mapping the n-ary relationship set SUPPLY from Figure 3.a.
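The SUPPLY mapping can be sketched in sqlite3 as follows. Only the key attributes SNAME, PARTNO, and PROJNAME are given in the text; the names of the three entity relations (SUPPLIER, PART, PROJECT), the column types, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
# Assumed names for the three participating entity relations.
conn.execute("CREATE TABLE SUPPLIER (SNAME TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE PART (PARTNO INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE PROJECT (PROJNAME TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE SUPPLY (
        SNAME    TEXT    NOT NULL REFERENCES SUPPLIER(SNAME),
        PARTNO   INTEGER NOT NULL REFERENCES PART(PARTNO),
        PROJNAME TEXT    NOT NULL REFERENCES PROJECT(PROJNAME),
        PRIMARY KEY (SNAME, PARTNO, PROJNAME)
    )""")
# Hypothetical sample data.
conn.execute("INSERT INTO SUPPLIER VALUES ('Acme')")
conn.executemany("INSERT INTO PART VALUES (?)", [(1,), (2,)])
conn.execute("INSERT INTO PROJECT VALUES ('Bridge')")
conn.executemany("INSERT INTO SUPPLY VALUES (?, ?, ?)",
                 [("Acme", 1, "Bridge"), ("Acme", 2, "Bridge")])

supply_rows = conn.execute("SELECT COUNT(*) FROM SUPPLY").fetchone()[0]
```

Each SUPPLY row records one (supplier, part, project) combination, which three separate binary tables could not express in general.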
Summary of Mapping constructs and constraints
To build our intuition, let's look at a specific instance of our application.
There are four countries, listing for them: Cname, Population (the latter only when
known):
o We create a table for Country “in the most obvious way,” by creating a
column for each attribute (underlining the attributes of the primary key)
There are five animals, listing for them: Species, Discovered (note that, even though
not required, Discovered happens to be known for every Species):
o We create a table for Animal as before
There are five employees, listing for them: ID#, Name, (name of) Child (note there
may be any number of Child values for an Employee, zero or more):
o We create a table for Employee in the most obvious way, and this does not
work:
o If we are ready to store up to 25 children for an employee and create a table
with 25 columns for children, perhaps tomorrow we get an employee with
26 children, who will not “fit”
o We replace our attempted single table for Employee by two tables
One for all the attributes of Employee other than the multivalued one
(Child)
One for pairs of the form (primary key of Employee, Child)
o Note that both tables have a fixed number of columns, no matter how many
children an employee has
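The split into two fixed-width tables can be illustrated in plain Python. The employee records below are hypothetical sample data, not the five employees of the original figure.

```python
# Hypothetical sample data: each employee carries a variable-length list of children.
employees = [
    {"id": 1, "name": "Alice", "children": ["Erica", "Frank"]},
    {"id": 2, "name": "Bob", "children": []},
    {"id": 3, "name": "Carol", "children": ["Gina"]},
]

# Table 1: every attribute of Employee except the multivalued one.
employee_table = [(e["id"], e["name"]) for e in employees]

# Table 2: one (primary key of Employee, Child) pair per child.
child_table = [(e["id"], c) for e in employees for c in e["children"]]
```

Both tables keep a fixed number of columns: an employee with no children simply contributes no rows to child_table, and an employee with 26 children would contribute 26 rows.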
Note that there are foreign key constraints
ID# appearing in Likes is a foreign key referencing
Employee
Species appearing in Likes is a foreign key referencing
Animal
Born needs to specify which employees were born in which countries (for whom
this information is known)
o Such a specification can be done using the primary keys of the entities/tables
o The relation Born contains some tuples:
o The above discussion implies that for every row in Employee there is at
most one “relevant” row of Born
o Therefore, the “extra” information about an employee that is currently
stored in Born can be added to Employee
o Born table can be removed from the design
o This sounds very formal, but it is intuitively very clear, as we can see from an
alternative design
Replace
By
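Folding Born into Employee can be sketched in plain Python. The rows are hypothetical sample data, and None is used here to stand for a null birth country; the column layout Born(ID#, Cname) is an assumption based on the statement that Born is specified through the primary keys of the two tables.

```python
# Hypothetical sample rows. Born holds (ID#, Cname) pairs, at most one per employee.
employee = [(1, "Alice"), (2, "Bob"), (3, "Carol")]
born = [(1, "US"), (3, "India")]

# Because each ID# occurs at most once in Born, Born can be turned into a lookup.
born_in = dict(born)
if len(born_in) != len(born):
    raise ValueError("an employee may have at most one birth country")

# Merge: Employee gains a Cname column; None marks an unknown country.
employee_merged = [(emp_id, name, born_in.get(emp_id)) for emp_id, name in employee]
```

After the merge, employee_merged carries everything Born did, so the separate Born table is no longer needed.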
5.4 EER-to-Relational Mapping
Convert each specialization with m subclasses {S1, S2, …, Sm} and generalized superclass
C, where the attributes of C are {k, a1, …, an} and k is the (primary) key, into relational
schemas using one of the following four options:
Option 8B: Multiple relations-Subclass relations only
o Create a relation Li for each subclass Si, 1 ≤ i ≤ m, with the attributes
Attr(Li) = {attributes of Si} ∪ {k, a1, …, an} and PK(Li) = k. This option only
works for a specialization that is total (every entity in the
superclass must belong to at least one of the subclasses) and disjoint.
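The rule for Option 8B can be written as a small Python function. The specialization used here (a vehicle superclass with CAR and TRUCK subclasses) and all attribute names are hypothetical and chosen only to make the formula concrete.

```python
# Hypothetical specialization: superclass C with key k and attributes a1..an,
# and two subclasses S1, S2, each with its own specific attributes.
superclass_key = "VehicleId"                  # k
superclass_attrs = ["Price", "LicensePlate"]  # a1..an
subclasses = {
    "CAR": ["MaxSpeed", "NoOfPassengers"],
    "TRUCK": ["NoOfAxles", "Tonnage"],
}


def option_8b(key, attrs, subs):
    """One relation Li per subclass Si:
    Attr(Li) = {k, a1..an} plus the attributes of Si, and PK(Li) = k."""
    return {
        name: {"attributes": [key] + attrs + own, "primary_key": key}
        for name, own in subs.items()
    }


schemas = option_8b(superclass_key, superclass_attrs, subclasses)
```

Note that no relation is created for the superclass itself, which is why an entity belonging to no subclass would have nowhere to be stored under this option.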
Figure 7: Option 8C Example.
Example: Convert the following EER diagram into Relational Database