ADBMS Lec4
Sunil Paudel
April 16, 2012
Functional Dependencies and
Normalization
Functional Dependencies (1)
FDs and keys are used to define normal
forms for relations
FDs are constraints that are derived from
the meaning and interrelationships of the
data attributes
A set of attributes X functionally determines a
set of attributes Y if the value of X determines
a unique value for Y
Functional Dependencies (2)
X -> Y holds if whenever two tuples have the same value
for X, they must have the same value for Y
For any two tuples t1 and t2 in any relation instance r(R):
If t1[X]=t2[X], then t1[Y]=t2[Y]
X -> Y in R specifies a constraint on all relation
instances r(R)
FDs are derived from the real-world constraints on the
attributes
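The tuple-based definition above can be turned into a small check. This is a minimal sketch, assuming a relation instance is a list of dictionaries; the function name fd_holds and the sample rows are hypothetical, and the attribute names are borrowed from the examples in these slides.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in this relation instance."""
    seen = {}  # maps each X-value to the Y-value first seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # two tuples agree on X but differ on Y
    return True


# Hypothetical sample instance, for illustration only.
works_on = [
    {"SSN": "111", "PNUMBER": 1, "HOURS": 20},
    {"SSN": "111", "PNUMBER": 2, "HOURS": 10},
    {"SSN": "222", "PNUMBER": 1, "HOURS": 20},
]

print(fd_holds(works_on, ["SSN", "PNUMBER"], ["HOURS"]))  # True
print(fd_holds(works_on, ["SSN"], ["HOURS"]))             # False
```

A single instance can only refute an FD. Because X -> Y is a constraint on all instances r(R), a True result here shows only that this instance does not violate it.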
Examples of FD constraints (1)
Social security number determines employee
name
SSN -> ENAME
Project number determines project name and
location
PNUMBER -> {PNAME, PLOCATION}
Employee SSN and project number together determine
the hours per week that the employee works on
the project
{SSN, PNUMBER} -> HOURS
Examples of FD constraints (2)
An FD is a property of the attributes in the
schema R
The constraint must hold on every relation
instance r(R)
If K is a key of R, then K functionally determines
all attributes in R (since we never have two
distinct tuples with t1[K]=t2[K])
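One way to test whether a set of attributes determines all attributes in R is the standard attribute-closure computation, which these slides do not cover. The sketch below assumes a single schema R that combines the attributes of the three example FDs; that schema and the names closure and is_superkey are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its left side is already covered.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    """attrs is a superkey if its closure covers the whole schema."""
    return closure(attrs, fds) >= set(schema)


# Assumed schema built from the example FDs in these slides.
R = {"SSN", "ENAME", "PNUMBER", "PNAME", "PLOCATION", "HOURS"}
FDS = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUMBER"}, {"PNAME", "PLOCATION"}),
    ({"SSN", "PNUMBER"}, {"HOURS"}),
]

print(is_superkey({"SSN", "PNUMBER"}, R, FDS))  # True
print(is_superkey({"SSN"}, R, FDS))             # False
```

Under these assumptions {SSN, PNUMBER} determines every attribute of R, while SSN alone determines only ENAME.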
Normal Forms Based on Primary Keys
Normalization of Relations
Practical Use of Normal Forms
Definitions of Keys and Attributes
Participating in Keys
First Normal Form
Second Normal Form
Third Normal Form
BCNF
4NF and 5NF
Normalization of Relations (1)
Normalization: The process of decomposing
unsatisfactory "bad" relations by breaking up
their attributes into smaller relations
Normalization of Relations (2)
2NF, 3NF, BCNF: based on keys and FDs of a
relation schema
4NF: based on keys and multi-valued dependencies
(MVDs)
5NF: based on keys and join dependencies (JDs)
Practical Use of Normal Forms
Normalization is carried out in practice so that the
resulting designs are of high quality and meet the
desirable properties
The practical utility of these normal forms becomes
questionable when the constraints on which they are
based are hard to understand or to detect
Database designers need not normalize to the
highest possible normal form; they usually stop at
3NF, BCNF, or 4NF
Normalization of Relations
Primarily a tool to validate and improve a logical
design so that it satisfies certain constraints that
avoid unnecessary duplication of data
The process of decomposing relations with
anomalies to produce smaller, well-structured
relations
Example
Why not a well-structured relation?
Insertion:
Can we enter a new employee without having the
employee take a class?
Deletion:
What will happen if we remove employee 140?
Modification:
Which records do we need to update in order to give
a salary increase to employee 100?
Anomalies in this Table
Insertion
Can’t enter a new employee without having the
employee take a class
Deletion
If we remove employee 140, we lose information
about the existence of a Tax Acc class
Modification
Giving a salary increase to employee 100 forces us
to update multiple records
Why do these anomalies exist?
Because there are two themes (entity types) in
this one relation. This results in data duplication
and an unnecessary dependency between the
entities
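The anomalies can be reproduced on a tiny table. The rows below are hypothetical and are not the table from the example slide; only employee numbers 100 and 140 and the Tax Acc class come from the text. The sketch counts the rows a salary change must touch, shows what a deletion loses, and then splits the table into one relation per theme.

```python
# Hypothetical rows: (emp_id, name, salary, course).
employee_course = [
    (100, "Ann", 48000, "SPSS"),
    (100, "Ann", 48000, "Surveys"),
    (140, "Bob", 52000, "Tax Acc"),
]


def rows_touched_by_raise(rows, emp_id):
    """Modification anomaly: every copy of the salary must be updated."""
    return sum(1 for r in rows if r[0] == emp_id)


def courses_after_delete(rows, emp_id):
    """Deletion anomaly: courses still known after removing an employee."""
    return {r[3] for r in rows if r[0] != emp_id}


print(rows_touched_by_raise(employee_course, 100))  # 2
print("Tax Acc" in courses_after_delete(employee_course, 140))  # False

# One relation per theme: employee facts and course enrolments.
employee = {(e, n, s) for e, n, s, _ in employee_course}
emp_course = {(e, c) for e, _, _, c in employee_course}

# Each salary is now stored once, and an employee with no class
# can be inserted into employee without touching emp_course.
print(sum(1 for e in employee if e[0] == 100))  # 1
```

In this sketch a course still exists only through its enrolments; keeping Tax Acc after employee 140 leaves would need a separate course relation, which is an assumption beyond what the slides show.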
Well-Structured Relations
Practical Use of Normal Forms
Denormalization:
The process of storing the join of higher normal form
relations as a base relation, which is in a lower
normal form
Functional Dependencies and Keys
Functional Dependency: The value of
one attribute determines the value of
another attribute
Candidate Key:
A unique identifier. One of the candidate
keys will become the primary key
• E.g., a table may hold both a credit card number
and an SS#; in that case both are
candidate keys
Each non-key field is functionally
dependent on every candidate key
First Normal Form
NOT in 1st normal form: table with multivalued attributes
1st normal form: table with no multivalued attributes and
unique rows
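Moving a table with a multivalued attribute into first normal form can be sketched as emitting one row per value. The data, the attribute name skills, and the helper to_1nf are hypothetical and only illustrate the idea.

```python
# Hypothetical un-normalized rows: "skills" holds several values per row.
unnormalized = [
    {"emp_id": 100, "name": "Ann", "skills": ["SQL", "Python"]},
    {"emp_id": 140, "name": "Bob", "skills": ["Tax"]},
]


def to_1nf(rows, multivalued):
    """Emit one row per value of the multivalued attribute."""
    flat = []
    for row in rows:
        for value in row[multivalued]:
            new_row = dict(row)           # copy the single-valued attributes
            new_row[multivalued] = value  # exactly one value per row
            flat.append(new_row)
    return flat


flat = to_1nf(unnormalized, "skills")
for row in flat:
    print(row)
```

Every attribute now holds a single value, and (emp_id, skills) identifies a row. Note that in this sketch a row whose list is empty would disappear, so such rows need separate handling.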
Functional dependency diagram for INVOICE
Getting it into Second Normal Form
Third Normal Form
2NF PLUS no transitive dependencies
(functional dependencies on non-primary-key
attributes)
Solution: A non-key determinant and the attributes
that depend on it go into a new table; the non-key
determinant becomes the primary key of the new
table and stays as a foreign key in the old table
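The solution above can be sketched as two projections. The sales relation, its attribute names, and the project helper are hypothetical; the assumed dependencies are cust_id -> salesperson and salesperson -> region, so region depends on the key only transitively.

```python
def project(rows, attrs):
    """Projection with duplicate elimination, keeping first-seen order."""
    seen, out = set(), []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


# Hypothetical 2NF relation with key cust_id.
sales = [
    {"cust_id": 1, "cust_name": "Acme", "salesperson": "Smith", "region": "South"},
    {"cust_id": 2, "cust_name": "Bolt", "salesperson": "Hicks", "region": "West"},
    {"cust_id": 3, "cust_name": "Core", "salesperson": "Smith", "region": "South"},
]

# New table: the non-key determinant becomes the primary key.
sperson = project(sales, ["salesperson", "region"])
# Old table: the determinant stays behind as a foreign key.
sales1 = project(sales, ["cust_id", "cust_name", "salesperson"])

print(sperson)
print(sales1)
```

Each salesperson's region is now stored once, and joining sales1 with sperson on salesperson reproduces the original rows.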
Removing partial dependencies
Getting it into Third Normal Form
Normal Forms Defined Informally
1st normal form
All attributes depend on the key
2nd normal form
All attributes depend on the whole key
3rd normal form
All attributes depend on nothing but the key
Enterprise Keys
Primary keys that are unique in the whole
database, not just within a single relation
Corresponds with the concept of an object ID
in object-oriented systems
Multi-Valued Dependencies
A multi-valued dependency occurs when a
determinant determines more than one
dependent, and the dependents are
independent of each other
Example: course multi-determines instructor and course
multi-determines text, where instructor and text are
independent
A relation with course, instructor, and text is all
key; it exhibits redundancy, yet it is in 3NF
Updates can exhibit anomalies
Fourth Normal Form
Relation R is in 4NF if and only if,
whenever there exist subsets A and B of
the attributes of R such that the nontrivial
multi-valued dependency A multi-determines B
(A ->> B) is satisfied, then all
attributes of R are also functionally
dependent on A
In the previous example, decompose
(course, instructor, text) into two relations:
(course, instructor) and (course, text)
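A short check that this decomposition loses nothing: project the three-column relation onto the two smaller ones and join them back on course. The instance below is hypothetical and assumes that instructors and texts are independent within each course.

```python
# Hypothetical instance of (course, instructor, text).
cit = {
    ("DB", "Smith", "Text A"),
    ("DB", "Smith", "Text B"),
    ("DB", "Jones", "Text A"),
    ("DB", "Jones", "Text B"),
    ("OS", "Smith", "Text C"),
}

# The two 4NF projections.
course_instructor = {(c, i) for c, i, _ in cit}
course_text = {(c, t) for c, _, t in cit}

# Natural join of the projections on course.
rejoined = {
    (c, i, t)
    for c, i in course_instructor
    for c2, t in course_text
    if c == c2
}

print(rejoined == cit)  # True: the decomposition is lossless here
```

In the original relation, adding a third text for DB means inserting one row per DB instructor; in course_text it is a single row.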
Fifth Normal Form
A relation R is in 5NF (also called
projection-join normal form) if and only if
every nontrivial join dependency that is
satisfied by R is implied by the candidate
key(s) of R
In the general case, SPJ (a relation over suppliers S,
parts P, and projects J) is not in 5NF, but its
projections SP, PJ, and JS are in 5NF
5NF is a generalization of 4NF, which in turn
generalizes BCNF and 3NF
It is the most general form possible for
projection-based normalization
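Whether an instance of SPJ satisfies the join dependency over SP, PJ, and JS can be tested by joining its three projections and comparing the result with the original. The tuples and the function names below are hypothetical; this is a sketch of the test, not a full 5NF checker.

```python
def join3(sp, pj, js):
    """Natural join of SP(s, p), PJ(p, j), and JS(j, s)."""
    return {
        (s, p, j)
        for s, p in sp
        for p2, j in pj
        if p == p2 and (j, s) in js
    }


def satisfies_jd(spj):
    """True if spj equals the join of its three binary projections."""
    sp = {(s, p) for s, p, _ in spj}
    pj = {(p, j) for _, p, j in spj}
    js = {(j, s) for s, _, j in spj}
    return join3(sp, pj, js) == spj


# Hypothetical instances of SPJ(supplier, part, project).
spj_without = {("S1", "P1", "J2"), ("S1", "P2", "J1"), ("S2", "P1", "J1")}
spj_with = spj_without | {("S1", "P1", "J1")}

print(satisfies_jd(spj_without))  # False: the join adds (S1, P1, J1)
print(satisfies_jd(spj_with))     # True
```

If the join dependency is a constraint on SPJ, the first instance is illegal, because the three pairwise facts force the tuple (S1, P1, J1) to be present. SPJ is all key, so the dependency is not implied by its candidate key, which is why SPJ is replaced by its three projections.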
Denormalization
Denormalization is said to be necessary to
improve performance
Technically, normalization is a concept of the
logical model and is not related to stored files
Most people confuse the two
In practice, denormalization will speed up
some queries, and drag down others
Proceed with caution
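The trade-off can be seen with an in-memory SQLite database. The table names, columns, and rows are hypothetical; the sketch stores the join of two normalized tables as a base table, then shows that a report needs no join while a salary change touches several rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
    CREATE TABLE emp_course (
        emp_id INTEGER REFERENCES employee(emp_id), course TEXT,
        PRIMARY KEY (emp_id, course));
    INSERT INTO employee VALUES (100, 'Ann', 48000), (140, 'Bob', 52000);
    INSERT INTO emp_course VALUES
        (100, 'SPSS'), (100, 'Surveys'), (140, 'Tax Acc');

    -- Denormalized base table: the stored join of the two tables above.
    CREATE TABLE employee_course AS
        SELECT e.emp_id, e.name, e.salary, c.course
        FROM employee e JOIN emp_course c ON c.emp_id = e.emp_id;
""")

# Faster for this query: the report needs no join.
report = conn.execute(
    "SELECT name, course FROM employee_course ORDER BY emp_id, course"
).fetchall()
print(report)

# Slower for this update: every copy of the salary must change.
cur = conn.execute(
    "UPDATE employee_course SET salary = 50000 WHERE emp_id = 100")
rows_updated = cur.rowcount
print(rows_updated)  # 2
```

The normalized employee table still holds the old salary after this update, so the application has to keep both copies in step.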