Lecture 8 1493715884
Lecture #8
Outline
Informal design guidelines for relation schemas
Functional dependencies
Normal forms based on primary keys
General definitions of second and third normal forms
Boyce-Codd normal form
Multivalued dependency and fourth normal form
Join dependencies and fifth normal form
Relational database design algorithms
Part 1
Dependencies & Normal Forms
Introduction
Informal Design Guidelines for Relation Schemas
Measures of quality
Making sure attribute semantics are clear
Reducing redundant information in tuples
Reducing NULL values in tuples
Disallowing possibility of generating spurious tuples
Imparting Clear Semantics to Attributes in Relations
Semantics of a relation
Meaning resulting from interpretation of attribute values in
a tuple
Easier to explain semantics of relation
Indicates better schema design
Guideline 1
Guideline 1 (cont.)
Redundant Information in Tuples and Update Anomalies
Guideline 2
NULL Values in Tuples
Guideline 3
Generation of Spurious Tuples
Figure 15.5(a)
Relation schemas EMP_LOCS and EMP_PROJ1
NATURAL JOIN
Result produces many more tuples than the original set of
tuples in EMP_PROJ
Called spurious tuples
Represent spurious information that is not valid
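The effect can be reproduced on a few rows. The sketch below is only an illustration: the sample rows, the lowercase attribute names, and the helpers `project` and `natural_join` are assumptions, and EMP_PROJ1 is reduced to three attributes for brevity.

```python
def project(rows, attrs):
    """Project a relation (list of dicts) onto attrs, dropping duplicates."""
    out = []
    for t in rows:
        p = {a: t[a] for a in attrs}
        if p not in out:
            out.append(p)
    return out


def natural_join(r, s):
    """Natural join on every attribute the two relations share."""
    out = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()
            if all(t[a] == u[a] for a in common):
                merged = {**t, **u}
                if merged not in out:
                    out.append(merged)
    return out


# Hypothetical EMP_PROJ state (not taken from Figure 15.5).
emp_proj = [
    {"ssn": "1", "ename": "Smith", "pnumber": 1, "plocation": "Bellaire"},
    {"ssn": "1", "ename": "Smith", "pnumber": 2, "plocation": "Sugarland"},
    {"ssn": "2", "ename": "Wong", "pnumber": 2, "plocation": "Sugarland"},
    {"ssn": "2", "ename": "Wong", "pnumber": 3, "plocation": "Houston"},
]

emp_locs = project(emp_proj, ["ename", "plocation"])
emp_proj1 = project(emp_proj, ["ssn", "pnumber", "plocation"])

# The only shared attribute is plocation, which is neither a key nor a
# foreign key, so the join pairs up rows that never belonged together.
joined = natural_join(emp_proj1, emp_locs)
spurious = [t for t in joined if t not in emp_proj]
print(len(emp_proj), len(joined), len(spurious))
```

With this sample state the join returns six tuples for four original ones; the two extra tuples attach the Sugarland project to the wrong employee name.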
Guideline 4
Summary and Discussion of Design Guidelines
Functional Dependencies
Definition of Functional Dependency
Definition of Functional Dependency (cont.)
Normal Forms Based on Primary Keys
Normalization process
Approaches for relational schema design
Perform a conceptual schema design using a conceptual
model then map conceptual design into a set of relations
Design relations based on external knowledge derived from
existing implementation of files or forms or reports
Normalization of Relations
Normalization of Relations (cont.)
Practical Use of Normal Forms
Definitions of Keys and Attributes Participating in Keys
First Normal Form
First Normal Form (cont.)
Second Normal Form
Third Normal Form
Problematic FD
Left-hand side is part of primary key
Left-hand side is a nonkey attribute
General Definitions of Second and Third Normal Forms
General Definitions of Second and Third Normal Forms
(cont.)
Prime attribute
Part of any candidate key will be considered as prime
Consider partial, full functional, and transitive
dependencies with respect to all candidate keys of a
relation
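Prime attributes can be found mechanically once the candidate keys are known. The following is a minimal sketch, assuming a small hypothetical schema and FD set (the attribute names and the helpers `closure` and `candidate_keys` are illustrative, not from the slides). It uses brute-force enumeration, which is only practical for small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(schema, fds):
    """Enumerate minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(schema):
                keys.append(candidate)
    return keys


# Hypothetical schema with two candidate keys.
schema = {"property_id", "county", "lot", "area", "price", "tax_rate"}
fds = [
    (("property_id",), ("county", "lot", "area", "price", "tax_rate")),
    (("county", "lot"), ("property_id", "area", "price", "tax_rate")),
    (("county",), ("tax_rate",)),
    (("area",), ("price",)),
]

keys = candidate_keys(schema, fds)
prime = set().union(*keys)
print(keys)
print(sorted(prime))
```

Here `county` and `lot` are prime even though neither belongs to the key `{property_id}`, which is why the general definitions consider all candidate keys rather than only the primary key.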
General Definition of Second Normal Form
Boyce-Codd Normal Form
Difference:
Condition which allows A to be prime is absent from BCNF
Most relation schemas that are in 3NF are also in BCNF
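The difference between the two tests can be shown in a few lines. This is a sketch under stated assumptions: the schema, the FDs, and the set of prime attributes are hypothetical inputs, and only the listed FDs are checked rather than the full closure F+.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def violations(schema, fds, prime):
    """Return (3NF violations, BCNF violations) among the listed FDs."""
    third, bcnf = [], []
    for lhs, rhs in fds:
        if closure(lhs, fds) >= set(schema):
            continue  # the left-hand side is a superkey: fine in both forms
        for a in sorted(set(rhs) - set(lhs)):
            bcnf.append((lhs, a))       # BCNF has no further escape clause
            if a not in prime:
                third.append((lhs, a))  # 3NF tolerates a prime right-hand side
    return third, bcnf


# Hypothetical relation with candidate keys {property_id} and {county, lot}.
schema = {"property_id", "county", "lot", "area"}
fds = [
    (("property_id",), ("county", "lot", "area")),
    (("county", "lot"), ("property_id", "area")),
    (("area",), ("county",)),
]
prime = {"property_id", "county", "lot"}

third, bcnf = violations(schema, fds, prime)
print(third)
print(bcnf)
```

In this example the FD from `area` to `county` is reported only as a BCNF violation: `area` is not a superkey, but `county` is prime, so 3NF accepts it.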
Multivalued Dependency and Fourth Normal Form
Multivalued Dependency and Fourth Normal Form
(cont.)
Join Dependencies and Fifth Normal Form
Join dependency
Multiway decomposition into fifth normal form (5NF)
Very peculiar semantic constraint
Normalization into 5NF is very rarely done in practice
Part 2
Relational DB Design Algorithms
Designing a Set of Relations (1)
Designing a Set of Relations (2)
Goals:
Lossless join property (a must)
Algorithm 16.3 tests for general losslessness.
Dependency preservation property
Algorithm 16.5 decomposes a relation into BCNF components by
sacrificing the dependency preservation.
Additional normal forms
4NF (based on multi-valued dependencies)
5NF (based on join dependencies)
1. Properties of Relational Decompositions (1)
Properties of Relational Decompositions (2)
Properties of Relational Decompositions (3)
Properties of Relational Decompositions (4)
Claim 1:
It is always possible to find a dependency-preserving
decomposition D with respect to F such that each relation Ri
in D is in 3NF.
Properties of Relational Decompositions (5)
Properties of Relational Decompositions (6)
Properties of Relational Decompositions (7)
Lossless (nonadditive) join test for n-ary decompositions.
(a) Case 1: Decomposition of EMP_PROJ into EMP_PROJ1 and
EMP_LOCS fails test.
(b) A decomposition of EMP_PROJ that has the lossless join property.
Properties of Relational Decompositions (9)
Lossless (nonadditive) join test for n-ary
decompositions.
(c) Case 2: Decomposition of EMP_PROJ into EMP,
PROJECT, and WORKS_ON satisfies test.
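The two cases can be checked with a small tableau (chase) test. The sketch below is not a transcription of Algorithm 16.3: the function name `lossless_join`, the lowercase attribute names, and the FD list are assumptions, and equated symbols are replaced across the whole column.

```python
def lossless_join(schema, decomposition, fds):
    """Tableau test: True if the decomposition has the nonadditive join property."""
    attrs = sorted(schema)
    # One row per relation Ri: "a" where Ri has the attribute, else a unique b symbol.
    rows = [
        {a: ("a" if a in ri else f"b{i}_{a}") for a in attrs}
        for i, ri in enumerate(decomposition)
    ]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            groups = {}
            for row in rows:
                groups.setdefault(tuple(row[a] for a in lhs), []).append(row)
            for group in groups.values():
                for a in rhs:
                    values = {row[a] for row in group}
                    if len(values) > 1:
                        # Rows agree on lhs, so their rhs symbols must be equal.
                        target = "a" if "a" in values else min(values)
                        for row in rows:
                            if row[a] in values:
                                row[a] = target
                        changed = True
    return any(all(v == "a" for v in row.values()) for row in rows)


schema = {"ssn", "ename", "pnumber", "pname", "plocation", "hours"}
fds = [
    (("ssn",), ("ename",)),
    (("pnumber",), ("pname", "plocation")),
    (("ssn", "pnumber"), ("hours",)),
]

case1 = [  # EMP_LOCS and EMP_PROJ1
    {"ename", "plocation"},
    {"ssn", "pnumber", "hours", "pname", "plocation"},
]
case2 = [  # EMP, PROJECT, and WORKS_ON
    {"ssn", "ename"},
    {"pnumber", "pname", "plocation"},
    {"ssn", "pnumber", "hours"},
]

print(lossless_join(schema, case1, fds))
print(lossless_join(schema, case2, fds))
```

Under these assumed FDs the first decomposition fails the test and the second passes, because the WORKS_ON row picks up the employee and project attributes through `ssn` and `pnumber`.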
Properties of Relational Decompositions (10)
Properties of Relational Decompositions (11)
Ri+1, ..., Rm} of R has the non-additive join property with respect
to F.
2. Algorithms for Relational Database Schema Design (1)
Algorithms for Relational Database Schema Design (2)
Algorithms for Relational Database Schema Design (3)
Algorithms for Relational Database Schema Design (4)
Algorithms for Relational Database Schema Design (8)
(a) The EMP relation with two MVDs: ENAME —>> PNAME and
ENAME —>> DNAME.
(b) Decomposing the EMP relation into two 4NF relations
EMP_PROJECTS and EMP_DEPENDENTS.
(c) The relation SUPPLY with no MVDs is in 4NF but not in 5NF if it has
the JD(R1, R2, R3). (d) Decomposing the relation SUPPLY into the
5NF relations R1, R2, and R3.
Multivalued Dependencies and Fourth Normal Form
(3)
Definition:
A multivalued dependency (MVD) X —>> Y specified on relation
schema R, where X and Y are both subsets of R, specifies the following
constraint on any relation state r of R: If two tuples t1 and t2 exist in r such
that t1[X] = t2[X], then two tuples t3 and t4 should also exist in r with the
following properties, where we use Z to denote (R − (X ∪ Y)):
t3[X] = t4[X] = t1[X] = t2[X].
t3[Y] = t1[Y] and t4[Y] = t2[Y].
t3[Z] = t2[Z] and t4[Z] = t1[Z].
An MVD X —>> Y in R is called a trivial MVD if (a) Y is a subset of X,
or (b) X ∪ Y = R.
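The definition translates directly into a check on one relation state. The sketch below assumes relations are lists of dicts; the function name `satisfies_mvd` and the sample EMP rows are hypothetical.

```python
def satisfies_mvd(rows, x, y):
    """Check whether the relation state rows satisfies the MVD X ->> Y."""
    if not rows:
        return True
    attrs = set(rows[0])
    z = attrs - set(x) - set(y)  # Z = R - (X union Y)
    present = {frozenset(t.items()) for t in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[a] == t2[a] for a in x):
                # t3 takes X and Y from t1 and Z from t2; the mirrored tuple
                # t4 is covered when the loop visits the pair (t2, t1).
                t3 = {a: (t2[a] if a in z else t1[a]) for a in attrs}
                if frozenset(t3.items()) not in present:
                    return False
    return True


# Hypothetical EMP state: projects and dependents vary independently.
emp = [
    {"ename": "Smith", "pname": "X", "dname": "John"},
    {"ename": "Smith", "pname": "Y", "dname": "Anna"},
    {"ename": "Smith", "pname": "X", "dname": "Anna"},
    {"ename": "Smith", "pname": "Y", "dname": "John"},
]

full_state = satisfies_mvd(emp, ["ename"], ["pname"])
missing_one = satisfies_mvd(emp[:3], ["ename"], ["pname"])
print(full_state, missing_one)
```

Dropping any one of the four rows breaks the dependency, which is the redundancy that 4NF decomposition removes.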
Multivalued Dependencies and Fourth Normal Form
(4)
Inference Rules for Functional and
Multivalued Dependencies:
IR1 (reflexive rule for FDs): If X ⊇ Y, then X –> Y.
IR2 (augmentation rule for FDs): {X –> Y} ⊨ XZ –> YZ.
IR3 (transitive rule for FDs): {X –> Y, Y –> Z} ⊨ X –> Z.
IR4 (complementation rule for MVDs): {X —>> Y} ⊨ {X —>>
(R − (X ∪ Y))}.
IR5 (augmentation rule for MVDs): If X —>> Y and W ⊇ Z,
then WX —>> YZ.
IR6 (transitive rule for MVDs): {X —>> Y, Y —>> Z} ⊨ X —>> (Z
− Y).
IR7 (replication rule for FD to MVD): {X –> Y} ⊨ X —>> Y.
IR8 (coalescence rule for FDs and MVDs): If X —>> Y and there
exists W with the properties that
(a) W ∩ Y is empty, (b) W –> Z, and (c) Y ⊇ Z, then X –> Z.
Multivalued Dependencies and Fourth Normal Form
(5)
Definition:
A relation schema R is in 4NF with respect to a set of
dependencies F (that includes functional dependencies and
multivalued dependencies) if, for every nontrivial
multivalued dependency X —>> Y in F+, X is a superkey for
R.
Note: F+ is the (complete) set of all dependencies (functional
or multivalued) that will hold in every relation state r of R that
satisfies F. It is also called the closure of F.
Multivalued Dependencies and Fourth Normal Form (7)
Multivalued Dependencies and Fourth Normal Form (8)
Algorithm 16.7: Relational decomposition into 4NF relations
with non-additive join property
Input: A universal relation R and a set of functional and multivalued
dependencies F.
1. Set D := { R };
2. While there is a relation schema Q in D that is not in 4NF do {
choose a relation schema Q in D that is not in 4NF;
find a nontrivial MVD X —>> Y in Q that violates 4NF;
replace Q in D by two relation schemas (Q - Y) and (X ∪ Y);
};
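The loop can be sketched in Python under a strong simplification: only the MVDs listed explicitly are tested (restricted to each schema Q), and the superkey test uses FD closure alone, so dependencies that are merely implied by F are not detected. The names `decompose_4nf` and `closure` and the sample schema are assumptions.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def decompose_4nf(schema, fds, mvds):
    """Split schemas on listed MVDs until none of them violates 4NF."""
    done, todo = [], [frozenset(schema)]
    while todo:
        q = todo.pop()
        for x, y in mvds:
            x = frozenset(x)
            if not x <= q:
                continue  # this MVD cannot be stated on Q
            y_in_q = (frozenset(y) & q) - x  # right-hand side restricted to Q
            trivial = not y_in_q or (x | y_in_q) == q
            if not trivial and not closure(x, fds) >= q:
                # Replace Q by (Q - Y) and (X union Y); both are smaller than Q.
                todo += [q - y_in_q, x | y_in_q]
                break
        else:
            done.append(q)  # no listed MVD violates 4NF on Q
    return done


# Hypothetical EMP(ename, pname, dname) with two MVDs and no FDs.
result = decompose_4nf(
    {"ename", "pname", "dname"},
    fds=[],
    mvds=[(("ename",), ("pname",)), (("ename",), ("dname",))],
)
print(sorted(sorted(r) for r in result))
```

For this input the sketch returns the two schemas {ename, pname} and {ename, dname}, matching the EMP_PROJECTS and EMP_DEPENDENTS split described earlier.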
4. Join Dependencies and Fifth Normal Form (1)
Definition:
A join dependency (JD), denoted by JD(R1, R2, ..., Rn),
specified on relation schema R, specifies a constraint on the
states r of R.
The constraint states that every legal state r of R should have a
non-additive join decomposition into R1, R2, ..., Rn; that is, for
every such r we have
*(π_R1(r), π_R2(r), ..., π_Rn(r)) = r
Note: an MVD is a special case of a JD where n = 2.
A join dependency JD(R1, R2, ..., Rn), specified on relation
schema R, is a trivial JD if one of the relation schemas Ri in
JD(R1, R2, ..., Rn) is equal to R.
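Whether a given state satisfies a JD can be tested by projecting and rejoining. This is a minimal sketch with hypothetical SUPPLY rows and helper names (`project`, `join`, `satisfies_jd`); it checks one state only, whereas the JD itself is a constraint on every legal state.

```python
def project(rows, attrs):
    """Project a list of dict tuples onto attrs as a set of frozensets."""
    return {frozenset((a, t[a]) for a in attrs) for t in rows}


def join(r, s):
    """Natural join of two relations stored as sets of (attr, value) frozensets."""
    out = set()
    for t in r:
        for u in s:
            dt, du = dict(t), dict(u)
            if all(dt[a] == du[a] for a in dt.keys() & du.keys()):
                out.add(frozenset({**dt, **du}.items()))
    return out


def satisfies_jd(rows, components):
    """True if joining the projections onto the components gives back rows."""
    result = project(rows, components[0])
    for attrs in components[1:]:
        result = join(result, project(rows, attrs))
    return result == {frozenset(t.items()) for t in rows}


components = [("sname", "part"), ("sname", "proj"), ("part", "proj")]

# Hypothetical SUPPLY state.
supply = [
    {"sname": "s1", "part": "p1", "proj": "j2"},
    {"sname": "s1", "part": "p2", "proj": "j1"},
    {"sname": "s2", "part": "p1", "proj": "j1"},
    {"sname": "s1", "part": "p1", "proj": "j1"},
]

with_fourth = satisfies_jd(supply, components)
without_fourth = satisfies_jd(supply[:3], components)
print(with_fourth, without_fourth)
```

The first three rows alone violate the JD: their projections rejoin to produce (s1, p1, j1), so that tuple must be present for the state to be legal.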
Join Dependencies and Fifth Normal Form (2)
Definition:
A relation schema R is in fifth normal form (5NF) (or
Project-Join Normal Form (PJNF)) with respect to a
set F of functional, multivalued, and join dependencies
if,
for every nontrivial join dependency JD(R1, R2, ..., Rn) in
F+ (that is, implied by F),
every Ri is a superkey of R.
Relation SUPPLY with Join Dependency and
Conversion to Fifth Normal Form
5. Inclusion Dependencies (1)
Definition:
An inclusion dependency R.X < S.Y between two sets of
attributes—X of relation schema R, and Y of relation schema S
—specifies the constraint that, at any specific time when r is a
relation state of R and s a relation state of S, we must have
π_X(r(R)) ⊆ π_Y(s(S))
Note:
The ⊆ (subset) relationship does not necessarily have to be a
proper subset.
The sets of attributes on which the inclusion dependency is
specified—X of R and Y of S—must have the same number of
attributes.
In addition, the domains for each pair of corresponding
attributes should be compatible.
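A check of an inclusion dependency on two relation states might look like the sketch below. The function name `satisfies_inclusion` and the DEPARTMENT and EMPLOYEE rows are hypothetical, and domain compatibility is not checked.

```python
def satisfies_inclusion(r, x, s, y):
    """Check R.X < S.Y: every X-value combination in r appears as Y-values in s."""
    if len(x) != len(y):
        raise ValueError("X and Y must have the same number of attributes")
    proj_r = {tuple(t[a] for a in x) for t in r}
    proj_s = {tuple(t[a] for a in y) for t in s}
    return proj_r <= proj_s  # subset, not necessarily proper


# Hypothetical states: every department manager must be an employee.
employee = [{"ssn": "111"}, {"ssn": "222"}, {"ssn": "333"}]
department = [
    {"dnumber": 1, "mgr_ssn": "111"},
    {"dnumber": 2, "mgr_ssn": "333"},
]

holds = satisfies_inclusion(department, ["mgr_ssn"], employee, ["ssn"])
broken = satisfies_inclusion(
    department + [{"dnumber": 3, "mgr_ssn": "999"}], ["mgr_ssn"], employee, ["ssn"]
)
print(holds, broken)
```

This is the same condition a foreign key enforces, which is the most common use of inclusion dependencies.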
Inclusion Dependencies (2)
Other Dependencies and Normal Forms (2)
Other Dependencies and Normal Forms (4)
Domain-Key Normal Form (DKNF):
Definition:
A relation schema is said to be in DKNF if all constraints and
dependencies that should hold on the valid relation states can be
enforced simply by enforcing the domain constraints and key
constraints on the relation.
The idea is to specify (theoretically, at least) the “ultimate normal form”
that takes into account all possible types of dependencies and constraints.
For a relation in DKNF, it becomes very straightforward to enforce all
database constraints by simply checking that each attribute value in a tuple
is of the appropriate domain and that every key constraint is enforced.
The practical utility of DKNF is limited.
Additional Material (Again)
Logical DB Design
Chapter 7
Logical Database
Design
Fundamentals of Database Management Systems,
2nd ed.
by
Mark L. Gillenson, Ph.D.
University of Memphis
Logical Database Design
The process of deciding how to arrange
the attributes of the entities in the business
environment into database structures,
such as the tables of a relational
database.
One-to-One: Option #1
One-to-One: Option #2
Separate tables for the
SALESPERSON and
OFFICE entities, with
Office Number as a
foreign key in the
SALESPERSON table.
One-to-One: Option #3
Separate tables for the
SALESPERSON and
OFFICE entities, with
Salesperson Number as a
foreign key in the OFFICE
table.
Converting Entities in Binary
Relationships: One-to-Many
Converting Entities in Binary
Relationships: Many-to-Many
An E-R diagram with two entities in a many-to-
many relationship converts to three relational
tables.
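The three-table result can be sketched with Python's built-in sqlite3 module. The PRODUCT and SALES tables, their columns, and the sample rows are assumptions chosen for illustration; only the SALESPERSON table name and its SPNUM and SPNAME columns come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SALESPERSON (SPNUM TEXT PRIMARY KEY, SPNAME TEXT);
CREATE TABLE PRODUCT (PRODNUM TEXT PRIMARY KEY, PRODNAME TEXT);
-- Associative table: one row per (salesperson, product) pair.
CREATE TABLE SALES (
    SPNUM TEXT REFERENCES SALESPERSON (SPNUM),
    PRODNUM TEXT REFERENCES PRODUCT (PRODNUM),
    QUANTITY INTEGER,
    PRIMARY KEY (SPNUM, PRODNUM)
);
INSERT INTO SALESPERSON VALUES ('137', 'Baker'), ('204', 'Dickens');
INSERT INTO PRODUCT VALUES ('P1', 'Hammer'), ('P2', 'Saw');
INSERT INTO SALES VALUES ('137', 'P1', 10), ('137', 'P2', 5), ('204', 'P1', 7);
""")

# Rebuild the many-to-many relationship by joining through SALES.
pairs = conn.execute("""
SELECT S.SPNAME, P.PRODNAME, X.QUANTITY
FROM SALES AS X
JOIN SALESPERSON AS S ON S.SPNUM = X.SPNUM
JOIN PRODUCT AS P ON P.PRODNUM = X.PRODNUM
ORDER BY S.SPNAME, P.PRODNAME
""").fetchall()
print(pairs)
```

The third table's primary key is the combination of the two entity keys, and any attribute of the relationship itself (here the assumed QUANTITY) lives in that table.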
Converting Entities in Unary
Relationships: One-to-One
With only one entity type
involved and with a one-to-
one relationship, the
conversion requires only
one table.
Converting Entities in Unary
Relationships: One-to-Many
Very similar to the one-
to-one unary case.
Converting Entities in Unary
Relationships: Many-to-Many
Designing the Good Reading
Bookstores Database
Designing the World Music
Association Database
Designing the Lucky
Rent-A-Car Database
The Data Normalization
Process
A methodology for organizing attributes
into tables so that redundancy among the
nonkey attributes is eliminated.
The Data Normalization
Technique
Input:
all the attributes that must be incorporated into the
database
Functional Dependence
Steps in the Data
Normalization Process
First Normal Form
The Data Normalization
Process
Once the attributes are arranged in third normal form,
the group of tables that they comprise is a well-
structured relational database with no data redundancy.
General Hardware Company:
Unnormalized Data
General Hardware Company:
First Normal Form
General Hardware Company:
First Normal Form
First normal form is merely a starting point in the
normalization process.
General Hardware Company:
Second Normal Form
General Hardware Company:
Third Normal Form
Does not allow transitive dependencies in
which one nonkey attribute is functionally
dependent on another.
General Hardware Company:
Third Normal Form
Important points about the third normal form
structure are:
Candidate Keys as
Determinants
There is one exception to the rule that in third
normal form, nonkey attributes are not allowed
to define other nonkey attributes.
General Hardware Company:
First Normal Form
Good Reading Bookstores:
Functional Dependencies
World Music Association:
Functional Dependencies
Lucky Rent-A-Car:
Functional Dependencies
Data Normalization Check
The basic idea in checking the structural
worthiness of relational tables, created
through E-R diagram conversion, with the
data normalization rules is to:
Check to see if there are any partial functional
dependencies.
Creating a View with SQL
CREATE VIEW EMPLOYEE AS
SELECT SPNUM, SPNAME, YEARHIRE
FROM SALESPERSON;
The SQL Update, Insert, and
Delete Commands
UPDATE SALESPERSON
SET COMMPERCT = 12
WHERE SPNUM = '204';
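Both statements can be tried end to end with Python's built-in sqlite3 module. In the sketch below the SALESPERSON column types and the sample rows are assumptions; the view and update statements follow the slides, with straight quotes around the literal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed table layout and rows, for illustration only.
conn.execute(
    "CREATE TABLE SALESPERSON ("
    "SPNUM TEXT PRIMARY KEY, SPNAME TEXT, COMMPERCT INTEGER, YEARHIRE INTEGER)"
)
conn.executemany(
    "INSERT INTO SALESPERSON VALUES (?, ?, ?, ?)",
    [("137", "Baker", 10, 1995), ("204", "Dickens", 10, 1998)],
)

# The view exposes three columns and hides the commission percentage.
conn.execute(
    "CREATE VIEW EMPLOYEE AS "
    "SELECT SPNUM, SPNAME, YEARHIRE FROM SALESPERSON"
)

# The update targets the base table, not the view.
conn.execute("UPDATE SALESPERSON SET COMMPERCT = 12 WHERE SPNUM = '204'")

view_rows = conn.execute("SELECT * FROM EMPLOYEE ORDER BY SPNUM").fetchall()
new_rate = conn.execute(
    "SELECT COMMPERCT FROM SALESPERSON WHERE SPNUM = '204'"
).fetchone()[0]
print(view_rows, new_rate)
```

Only the row with SPNUM '204' changes, and the view continues to show the same three columns because it stores a query, not a copy of the data.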