DBS-er Model
An entity set is a set of entities of the same type that share the same properties.
– Example: set of all persons, companies, trees, holidays
Entity Sets customer and loan
For instance, the depositor relationship set between entity sets customer and
account may have the attribute access-date
Degree of a Relationship Set
Relationship sets that involve two entity sets are binary (or degree two).
Most relationship sets in a database system are binary; relationship sets that
involve more than two entity sets are rare.
E-R Diagram with a Ternary Relationship
Attributes
An entity is represented by a set of attributes, that is, descriptive properties
possessed by all members of an entity set.
Example:
customer = (customer_id, customer_name, customer_street, customer_city )
loan = (loan_number, amount )
Attribute types:
– Simple and composite attributes.
– Single-valued and multi-valued attributes
• Example: multivalued attribute: phone_numbers
– Derived attributes
• Can be computed from other attributes
• Example: age, given date_of_birth
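As a small illustration of a derived attribute, the sketch below computes age from a stored date_of_birth instead of storing it. The function name age_on and the sample dates are assumptions made for this example only.

```python
from datetime import date


def age_on(date_of_birth: date, today: date) -> int:
    """Derive age in whole years from the stored attribute date_of_birth."""
    years = today.year - date_of_birth.year
    # Subtract one year if the birthday has not yet occurred in the current year.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


# Hypothetical sample values: the day before and the day of the 34th birthday.
print(age_on(date(1990, 6, 15), date(2024, 6, 14)))
print(age_on(date(1990, 6, 15), date(2024, 6, 15)))
```

Because age is recomputed on demand, it never goes stale the way a stored value would.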
Composite Attributes
Relationship Sets with Attributes
E-R Diagram With Composite, Multivalued, and Derived Attributes
Mapping Cardinality Constraints
Express the number of entities to which another entity can be associated via a
relationship set.
For a binary relationship set the mapping cardinality must be one of the
following types:
– One to one
– One to many
– Many to one
– Many to many
Mapping Cardinalities…contd.
Lines link attributes to entity sets and entity sets to relationship sets.
Roles
• Entity sets of a relationship need not be distinct.
• The labels “manager” and “worker” are called roles; they specify how employee entities
interact via the works_for relationship set.
• Roles are indicated in E-R diagrams by labeling the lines that connect diamonds to
rectangles.
• Role labels are optional, and are used to clarify semantics of the relationship
Cardinality Constraints
• We express cardinality constraints by drawing either a directed line (->), signifying
“one,” or an undirected line (—), signifying “many,” between the relationship set and
the entity set.
• One-to-one relationship:
– A customer is associated with at most one loan via the relationship borrower
– A loan is associated with at most one customer via borrower
One-To-Many Relationship
• In a one-to-many relationship, a loan is associated with at most one customer via borrower,
while a customer is associated with several (including 0) loans via borrower
Many-To-One Relationships
• In a many-to-one relationship, a loan is associated with several (including 0) customers
via borrower, while a customer is associated with at most one loan via borrower
Many-To-Many Relationship
A customer is associated with several (possibly 0) loans via borrower
A loan is associated with several (possibly 0) customers via borrower
Participation of an Entity Set in a Relationship Set
Total participation (indicated by double line): every entity in the entity set participates in at
least one relationship in the relationship set
E.g. participation of loan in borrower is total
every loan must have a customer associated to it via borrower
Partial participation: some entities may not participate in any relationship in the relationship
set
Example: participation of customer in borrower is partial
Alternative Notation for Cardinality Limits
Cardinality limits can also express participation constraints
Weak Entity Sets
An entity set that does not have a primary key is referred to as a weak entity set.
The existence of a weak entity set depends on the existence of an identifying entity
set
– It must relate to the identifying entity set via a total, one-to-many relationship
set from the identifying to the weak entity set
– Identifying relationship depicted using a double diamond
The discriminator (or partial key) of a weak entity set is the set of attributes that
distinguishes among all the entities of a weak entity set.
The primary key of a weak entity set is formed by the primary key of the strong
entity set on which the weak entity set is existence dependent, plus the weak entity
set’s discriminator.
Weak Entity Sets … .contd
Primary keys allow entity sets and relationship sets to be expressed uniformly
as relation schemas that represent the contents of the database.
For each entity set and relationship set there is a unique schema that is assigned
the name of the corresponding entity set or relationship set.
A weak entity set becomes a table that includes a column for the primary key
of the identifying strong entity set
payment =
( loan_number, payment_number, payment_date, payment_amount )
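One way to see the composite primary key at work is a small sketch using Python's built-in sqlite3 module. The column types, the sample loan numbers, and the helper try_insert are assumptions for illustration; only the attribute names come from the schemas above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE loan (loan_number TEXT PRIMARY KEY, amount REAL)")
conn.execute("""
    CREATE TABLE payment (
        loan_number    TEXT REFERENCES loan(loan_number),  -- key of the strong entity set
        payment_number INTEGER,                            -- discriminator
        payment_date   TEXT,
        payment_amount REAL,
        PRIMARY KEY (loan_number, payment_number)
    )
""")
conn.execute("INSERT INTO loan VALUES ('L-17', 1000.0)")
conn.execute("INSERT INTO loan VALUES ('L-23', 2000.0)")
# The same payment_number may appear under different loans.
conn.execute("INSERT INTO payment VALUES ('L-17', 1, '2024-01-05', 50.0)")
conn.execute("INSERT INTO payment VALUES ('L-23', 1, '2024-01-07', 75.0)")


def try_insert(row):
    """Hypothetical helper: return True if the payment row is accepted."""
    try:
        conn.execute("INSERT INTO payment VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False
```

In this sketch a second payment numbered 1 for loan L-17 is rejected by the primary key, and a payment for a loan that does not exist is rejected by the foreign key, which mirrors the existence dependency of the weak entity set.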
For one-to-one relationship sets, either side can be chosen to act as the
“many” side.
- That is, extra attribute can be added to either of the tables corresponding
to the two entity sets.
The schema corresponding to a relationship set linking a weak entity set to its
identifying strong entity set is redundant.
Example: The payment schema already contains the attributes that would
appear in the loan_payment schema (i.e., loan_number and
payment_number).
Redundancy of Schemas
Many-to-one and one-to-many relationship sets that are total on the many-side can be
represented by adding an extra attribute to the “many” side, containing the primary key
of the “one” side
– Each value of the multivalued attribute maps to a separate tuple of the relation on schema
EM
For example, an employee entity with primary key 123-45-6789 and dependents Jack
and Jane maps to two tuples:
(123-45-6789 , Jack) and (123-45-6789 , Jane)
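The mapping from a multivalued attribute to separate tuples can be sketched as follows. The dictionary keys employee_id and dependent_names and the function name are assumptions for this example; the sample values are the ones used above.

```python
def flatten_multivalued(entities, key, attribute):
    """Return one (key value, attribute value) tuple per value of a multivalued attribute."""
    return [(entity[key], value) for entity in entities for value in entity[attribute]]


employees = [{"employee_id": "123-45-6789", "dependent_names": ["Jack", "Jane"]}]
em_rows = flatten_multivalued(employees, "employee_id", "dependent_names")
print(em_rows)  # [('123-45-6789', 'Jack'), ('123-45-6789', 'Jane')]
```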
Extended E-R Features: Specialization
• Attribute inheritance – a lower-level entity set inherits all the attributes and
relationship participation of the higher-level entity set to which it is linked.
Specialization
Extended ER Features: Generalization
A bottom-up design process – combine a number of entity sets that share the same
features into a higher-level entity set.
Specialization and generalization are simple inversions of each other; they are
represented in an E-R diagram in the same way.
User Defined Subclass: When we do not have a condition for determining membership in
a subclass, the subclass is called user-defined. Membership in such a subclass is
determined by the database users when they apply the operation to add an entity to the
subclass
A total specialization constraint specifies that every entity in the superclass must be a
member of at least one subclass in the specialization. A double line is used to display a
total specialization
Partial specialization: allows an entity not to belong to any of the subclasses. A single line is
used to display a partial specialization
Aggregation
• Method 1:
– Form a schema for the higher-level entity
– Form a schema for each lower-level entity set, include primary key
of higher-level entity set and local attributes
   schema      attributes
   person      name, street, city
   customer    name, credit_rating
   employee    name, salary
• Method 2:
– Form a schema for each entity set with all local and inherited attributes
   schema      attributes
   person      name, street, city
   customer    name, street, city, credit_rating
   employee    name, street, city, salary
– If specialization is total, the schema for the generalized entity set (person) is
not required to store information
• Can be defined as a “view” relation containing union of specialization
relations
• But explicit schema may still be needed for foreign key constraints
– Drawback: street and city may be stored redundantly for people who are
both customers and employees
E-R Design Decisions
The use of aggregation – can treat the aggregate entity set as a single unit without
concern for the details of its internal structure.
E-R Diagram for a Banking Enterprise
Data Base Design
Relational Database Design
Database-Design Process
No repetition, but we have to create a tuple with a null value for amount
Design Alternatives: Smaller Schemas
Suppose we had started with bor_loan. How would we know to split up (decompose) it
into borrower and loan?
Write a rule “if there were a schema (loan_number, amount), then loan_number
would be a candidate key”
In bor_loan, because loan_number is not a candidate key, the amount of a loan may have
to be repeated. This indicates the need to decompose bor_loan.
The next slide shows how we lose information: we cannot reconstruct the original
employee relation, so this is a lossy decomposition.
A Lossy Decomposition
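A lossy decomposition can be reproduced in a few lines. The sketch below uses a hypothetical employee relation (the attribute names and the two rows are assumptions, not the data from the original slide), projects it onto two smaller schemas that share only name, and joins the projections back together.

```python
def project(rows, attrs):
    """Relational projection: keep only attrs; the set removes duplicate tuples."""
    return {frozenset((a, dict(r)[a]) for a in attrs) for r in rows}


def natural_join(left, right):
    """Natural join: combine tuples that agree on all shared attributes."""
    out = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
                out.add(frozenset({**ld, **rd}.items()))
    return out


# Hypothetical data: two different employees who happen to share a name.
employee = {
    frozenset({"id": 57766, "name": "Kim", "city": "Perryridge"}.items()),
    frozenset({"id": 98776, "name": "Kim", "city": "Hampton"}.items()),
}
r1 = project(employee, ["id", "name"])
r2 = project(employee, ["name", "city"])
joined = natural_join(r1, r2)
print(len(employee), len(joined))
```

The join contains every original tuple plus spurious ones that pair each id with the other employee's city, so the original relation cannot be recovered from r1 and r2.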
Guideline 1
• Design a relation schema so that it is easy to explain its meaning. Do
not combine attributes from multiple entity types and relationship
types into a single relation. Intuitively, if a relation schema
corresponds to one entity type or one relationship type, it is
straightforward to interpret and to explain its meaning. Otherwise, if
the relation corresponds to a mixture of multiple entities and
relationships, semantic ambiguities will result and the relation cannot
be easily explained.
First Normal Form
• Atomicity is actually a property of how the elements of the domain are used.
– E.g.: Strings would normally be considered indivisible
– Suppose that students are given roll numbers which are strings of the form
CS0012 or EE1127
– If the first two characters are extracted to find the department, the domain
of roll numbers is not atomic.
In the case that a relation R is not in “good” form, decompose it into a set of
relations {R1, R2, ..., Rn} such that:
– Functional dependencies
– Multivalued dependencies
Functional Dependencies
Require that the value for a certain set of attributes determines uniquely the
value for another set of attributes.
A B C D
a1 b1 c1 d1
a1 b2 c1 d2
a2 b2 c2 d2
a2 b3 c2 d3
a3 b3 c2 d4
In the above relation r, the following functional dependency holds:
A → C
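Whether a functional dependency holds in a given relation instance can be checked mechanically. The sketch below encodes the five-tuple relation r on (A, B, C, D) as a list of dictionaries; the function name fd_holds is an assumption for this example.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the relation instance satisfies lhs -> rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True


r = [
    {"A": "a1", "B": "b1", "C": "c1", "D": "d1"},
    {"A": "a1", "B": "b2", "C": "c1", "D": "d2"},
    {"A": "a2", "B": "b2", "C": "c2", "D": "d2"},
    {"A": "a2", "B": "b3", "C": "c2", "D": "d3"},
    {"A": "a3", "B": "b3", "C": "c2", "D": "d4"},
]
print(fd_holds(r, ["A"], ["C"]))  # True
print(fd_holds(r, ["C"], ["A"]))  # False: c2 appears with both a2 and a3
```

Note that such a check only shows that an instance is consistent with a dependency; whether the dependency is a real constraint is a design decision about all legal instances.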
A B C
a1 b1 c1
a1 b1 c2
a2 b1 c1
a2 b1 c3
F+ is a superset of F.
Procedure for Computing F+
• To compute the closure of a set of functional dependencies F:
    F+ = F
    repeat
        for each functional dependency f in F+
            apply reflexivity and augmentation rules on f
            add the resulting functional dependencies to F+
        for each pair of functional dependencies f1 and f2 in F+
            if f1 and f2 can be combined using transitivity
                then add the resulting functional dependency to F+
    until F+ does not change any further
Closure of a Set of Functional Dependencies
Given a set F of functional dependencies, there are certain other functional
dependencies that are logically implied by F.
– For example: If A → B and B → C, then we can infer that A → C
The set of all functional dependencies logically implied by F is the closure of F.
We denote the closure of F by F+.
Compute Members of F+
{Ssn}+ = {Ssn, Ename}
{Pnumber}+ = {Pnumber, Pname, Plocation}
{Ssn, Pnumber}+ = {Ssn, Pnumber, Ename, Pname, Plocation, Hours}
Given R = (A, B, C, G, H, I)
F = { A → B
      A → C
      CG → H
      CG → I
      B → H }
Compute F+.
• Members of F+
– A → H
• by transitivity from A → B and B → H
– AG → I
• by augmenting A → C with G, to get AG → CG
and then transitivity with CG → I
– CG → HI
• by augmenting CG → I to infer CG → CGI,
and augmenting CG → H to infer CGI → HI,
and then transitivity
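The three members of F+ derived above can be double-checked with attribute closure, a different technique from the rule-by-rule procedure: α → β is in F+ exactly when β is contained in the closure of α under F. In this sketch each attribute is a single letter, each dependency is a (left side, right side) pair of strings, and the function names are assumptions for this example.

```python
def attribute_closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire a dependency when its left side is already in the closure.
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def implies(fds, lhs, rhs):
    """True if lhs -> rhs is in the closure of fds."""
    return set(rhs) <= attribute_closure(lhs, fds)


F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]
print(implies(F, "A", "H"), implies(F, "AG", "I"), implies(F, "CG", "HI"))
```

This avoids enumerating F+ itself, which can be exponentially large in the number of attributes.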
Compute the closure of the following set F of functional dependencies for relation
schema R = (A, B, C, D, E)
A → BC
CD → E
B → D
E → A
First Normal Form
DEPARTMENT
Dname Dnumber Dmgr_ssn Dlocations
- Remove the attribute Dlocations that violates 1NF and place it in a separate relation
DEPT_LOCATIONS along with the primary key Dnumber of DEPARTMENT. The primary key
of this relation is the combination of {Dnumber, Dlocations}
Dnumber Dlocations
- Expand the key so that there will be a separate tuple in the original DEPARTMENT
relation for each location of a DEPARTMENT.
Dname     Dnumber  Dmgr_ssn  Dlocations
Research  10       35        {Lab1, Lab2}
Admin     1        25        First floor
- If a maximum number of values is known for the attribute, then replace the attribute by
that many atomic attributes.
Dname     Dnumber  Dmgr_ssn  Dlocation1   Dlocation2
Research  10       35        Lab1         Lab2
Admin     1        25        First floor  NULL
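The first technique, moving Dlocations into a separate DEPT_LOCATIONS relation keyed by {Dnumber, Dlocations}, can be sketched as follows. The list-of-dictionaries encoding and the function name to_1nf are assumptions for this example; the sample rows are the ones shown above.

```python
department = [
    {"Dname": "Research", "Dnumber": 10, "Dmgr_ssn": 35, "Dlocations": ["Lab1", "Lab2"]},
    {"Dname": "Admin", "Dnumber": 1, "Dmgr_ssn": 25, "Dlocations": ["First floor"]},
]


def to_1nf(rows):
    """Split the multivalued Dlocations attribute into its own relation."""
    # DEPARTMENT keeps only atomic attributes.
    dept = [{k: v for k, v in row.items() if k != "Dlocations"} for row in rows]
    # DEPT_LOCATIONS holds one (Dnumber, Dlocation) tuple per location.
    dept_locations = [(row["Dnumber"], loc) for row in rows for loc in row["Dlocations"]]
    return dept, dept_locations


dept, dept_locations = to_1nf(department)
print(dept_locations)  # [(10, 'Lab1'), (10, 'Lab2'), (1, 'First floor')]
```

Unlike the third technique, this form needs no NULL padding and places no fixed limit on the number of locations per department.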
Insertion Anomalies:
- To insert a new employee tuple into EMP_DEPT, we must include either
the attribute values for the department that the employee works for or
NULLs.
- It is difficult to insert a new department that has no employees yet into the
EMP_DEPT relation. The only option is to place NULL values in the employee
attributes, which is not possible.
If many of the attributes do not apply to all tuples in the relation, we end up with
many NULLs in those tuples. This can waste space at the storage level and may also
lead to problems with understanding the meaning of the attributes.
- The value is known but absent; that is, it has not been recorded yet.
Guideline 3: As far as possible, avoid placing attributes in a base relation whose
values may frequently be NULL. If NULLs are unavoidable, make sure that they
apply in exceptional cases only and do not apply to a majority of tuples in the relation.
EMP_LOCS
Ename plocation
EMP_PROJ1
– Attribute A is extraneous in β (where α → β is a dependency in F) if A ∈ β
and the set of functional dependencies
(F – {α → β}) ∪ {α → (β – A)} logically implies F.
E.g.: Given F = {A → C, AB → C}
– B is extraneous in AB → C because {A → C, AB → C} logically implies A → C (i.e.,
the result of dropping B from AB → C).
- Each left side of a functional dependency in Fc is unique. That is, there are no
two dependencies
α1 → β1 and α2 → β2 in Fc such that α1 = α2
Canonical Cover
A canonical cover for F is a set of dependencies Fc such that
Consider the following set F of functional dependencies on schema (A, B, C). Compute the
canonical cover for F.
A → BC
B → C
A → B
AB → C
Answer:
A → B
B → C
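A proposed canonical cover must be equivalent to F: each set has to imply every dependency of the other. The sketch below checks that for this exercise using attribute closure. It does not compute the cover and does not test for extraneous attributes; the function names and the pair-of-strings encoding are assumptions for this example.

```python
def attribute_closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def covers(fds_a, fds_b):
    """True if every dependency in fds_b is logically implied by fds_a."""
    return all(set(rhs) <= attribute_closure(lhs, fds_a) for lhs, rhs in fds_b)


F = [("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C")]
Fc = [("A", "B"), ("B", "C")]
equivalent = covers(F, Fc) and covers(Fc, F)
print(equivalent)  # True
```

Dropping either dependency from Fc breaks the equivalence, which is a quick sanity check that the answer contains nothing redundant.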
Compute the closure of the following set F of functional dependencies for relation schema R = {A, B,
C, D, E}.
A → BC
CD → E
B → D
E → A
List the candidate keys for R.
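For a schema this small, the candidate keys can be found by brute force: try attribute subsets in order of increasing size and keep those whose closure is all of R, skipping supersets of keys already found. This is a sketch under the assumption of single-letter attribute names; the function names are hypothetical, and the enumeration is exponential in the number of attributes.

```python
from itertools import combinations


def attribute_closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def candidate_keys(schema, fds):
    """Return all minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            # A superset of an already found key is a superkey but not minimal.
            if any(key <= candidate for key in keys):
                continue
            if attribute_closure(candidate, fds) == set(schema):
                keys.append(candidate)
    return keys


R = "ABCDE"
F = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]
keys = sorted("".join(sorted(k)) for k in candidate_keys(R, F))
print(keys)  # ['A', 'BC', 'CD', 'E']
```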
r = ∏R1(r) ⋈ ∏R2(r)
A decomposition of R into R1 and R2 is lossless join if and only if at least one of the
following dependencies is in F+:
• R1 ∩ R2 → R1
• R1 ∩ R2 → R2
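The test above reduces to one attribute closure: compute the closure of R1 ∩ R2 under F and check whether it contains all of R1 or all of R2. The schema R = (A, B, C) and the dependencies A → B, B → C used below are hypothetical, as are the function names.

```python
def attribute_closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def is_lossless(r1, r2, fds):
    """Binary lossless-join test: (R1 ∩ R2) -> R1 or (R1 ∩ R2) -> R2 is in F+."""
    common_closure = attribute_closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure


# Hypothetical schema R = (A, B, C) with F = {A -> B, B -> C}.
F = [("A", "B"), ("B", "C")]
print(is_lossless("AB", "BC", F))  # True: B -> BC
print(is_lossless("AC", "BC", F))  # False: C determines neither AC nor BC
```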
E.g.:
Use of Multivalued Dependencies
We use multivalued dependencies in two ways:
1. To test relations to determine whether they are legal under a given set of
functional and multivalued dependencies
R1 ∩ R2 →→ R1 – R2
OR
R1 ∩ R2 →→ R2 – R1
Decomposition
a) R1 = (A, B) (R1 is in 4NF)