ADBMS Lec4

Advanced Database Management System

Sunil Paudel
[email protected]

April 16, 2012

Functional Dependencies and Normalization

Functional Dependencies (1)
• FDs and keys are used to define normal forms for relations
• FDs are constraints that are derived from the meaning and interrelationships of the data attributes
• A set of attributes X functionally determines a set of attributes Y if the value of X determines a unique value for Y

Functional Dependencies (2)
• X -> Y holds if, whenever two tuples have the same value for X, they must have the same value for Y
• For any two tuples t1 and t2 in any relation instance r(R): if t1[X] = t2[X], then t1[Y] = t2[Y]
• X -> Y in R specifies a constraint on all relation instances r(R)
• FDs are derived from the real-world constraints on the attributes
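The tuple-based definition above can be checked mechanically against a given relation instance. The following is a minimal sketch in Python; the function name `fd_holds` and the sample rows are hypothetical and chosen only for illustration.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is satisfied by this relation instance."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen and returns the stored value
        if seen.setdefault(x, y) != y:
            return False  # two tuples agree on X but differ on Y
    return True


# Hypothetical sample tuples (not taken from the slides)
works_on = [
    {"SSN": "111", "PNUMBER": 1, "HOURS": 10},
    {"SSN": "111", "PNUMBER": 2, "HOURS": 20},
    {"SSN": "222", "PNUMBER": 1, "HOURS": 10},
]

print(fd_holds(works_on, ["SSN", "PNUMBER"], ["HOURS"]))  # True
print(fd_holds(works_on, ["SSN"], ["HOURS"]))             # False
```

Note that a single instance can only refute an FD. Because an FD is a constraint on all instances r(R), a `True` result here shows that the sample data is consistent with the FD, not that the FD holds for the schema.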

Examples of FD constraints (1)
• Social security number determines employee name
  SSN -> ENAME
• Project number determines project name and location
  PNUMBER -> {PNAME, PLOCATION}
• Employee SSN and project number together determine the hours per week that the employee works on the project
  {SSN, PNUMBER} -> HOURS

Examples of FD constraints (2)
• An FD is a property of the attributes in the schema R
• The constraint must hold on every relation instance r(R)
• If K is a key of R, then K functionally determines all attributes in R (since we never have two distinct tuples with t1[K] = t2[K])
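The link between keys and FDs can be made concrete with the standard attribute-closure computation: a set of attributes is a superkey exactly when its closure under the FDs contains every attribute of R. The sketch below uses the three example FDs from the slides; placing all of their attributes in one relation R is an assumption made for illustration, and the function names are hypothetical.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, all_attrs, fds):
    return closure(attrs, fds) >= set(all_attrs)


# Assumption: all attributes from the example FDs live in a single relation R
R = {"SSN", "ENAME", "PNUMBER", "PNAME", "PLOCATION", "HOURS"}
FDS = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUMBER"}, {"PNAME", "PLOCATION"}),
    ({"SSN", "PNUMBER"}, {"HOURS"}),
]

print(sorted(closure({"SSN"}, FDS)))          # ['ENAME', 'SSN']
print(is_superkey({"SSN", "PNUMBER"}, R, FDS))  # True
print(is_superkey({"SSN"}, R, FDS))             # False
```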

Normal Forms Based on Primary Keys
• Normalization of Relations
• Practical Use of Normal Forms
• Definitions of Keys and Attributes Participating in Keys
• First Normal Form
• Second Normal Form
• Third Normal Form
• BCNF
• 4NF and 5NF

Normalization of Relations (1)
• Normalization: The process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations
• Normal form: A condition using keys and FDs of a relation to certify whether a relation schema is in a particular normal form

Normalization of Relations (2)
• 2NF, 3NF, and BCNF are based on keys and FDs of a relation schema
• 4NF is based on keys and multi-valued dependencies (MVDs)
• 5NF is based on keys and join dependencies (JDs)

Practical Use of Normal Forms
• Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties
• The practical utility of these normal forms becomes questionable when the constraints on which they are based are hard to understand or to detect
• Database designers need not normalize to the highest possible normal form (usually up to 3NF, BCNF or 4NF)

Normalization of Relations
• Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data
• The process of decomposing relations with anomalies to produce smaller, well-structured relations

Example

Question: Is this a relation?
Answer: Yes. It has unique rows and no multivalued attributes. However, it is not a well-structured relation.

Why not a well-structured relation?
• Insertion:
  • Can we enter a new employee without having the employee take a class?
• Deletion:
  • What will happen if we remove employee 140?
• Modification:
  • Which records do we need to update in order to give a salary increase to employee 100?

Anomalies in this Table
• Insertion
  • can’t enter a new employee without having the employee take a class
• Deletion
  • if we remove employee 140, we lose information about the existence of a Tax Acc class
• Modification
  • giving a salary increase to employee 100 forces us to update multiple records

Why do these anomalies exist?
Because there are two themes (entity types) in this one relation. This results in data duplication and an unnecessary dependency between the entities.
Well-Structured Relations
• A relation that contains minimal data redundancy and allows users to insert, delete, and update rows without causing data inconsistencies
• Goal is to avoid anomalies
  • Insertion anomaly: adding new rows forces the user to create duplicate data
  • Deletion anomaly: deleting rows may cause a loss of data that would be needed for other future rows
  • Modification anomaly: changing data in a row forces changes to other rows because of duplication

General rule of thumb: A table should not pertain to more than one entity type
Steps in normalization

Practical Use of Normal Forms
• Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties
• Database designers need not normalize to the highest possible normal form
  • (usually up to 3NF, BCNF or 4NF)
• Denormalization:
  • The process of storing the join of higher normal form relations as a base relation, which is in a lower normal form

Functional Dependencies and Keys
• Functional Dependency: The value of one attribute determines the value of another attribute
• Candidate Key:
  • A unique identifier. One of the candidate keys will become the primary key
    • E.g. a table may contain both a credit card number and SS#; in this case both are candidate keys
  • Each non-key field is functionally dependent on every candidate key

First Normal Form

No multivalued attributes

Every attribute value is atomic
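As a rough illustration of removing a multivalued attribute, the sketch below expands a list-valued field into one row per value, so that every attribute value is atomic. The rows, attribute names, and the function `to_1nf` are hypothetical and do not reproduce the table shown on the slides.

```python
def to_1nf(rows, multivalued_attr):
    """Emit one row per value of the multivalued attribute."""
    flat = []
    for row in rows:
        for value in row[multivalued_attr]:
            new_row = dict(row)            # copy the single-valued attributes
            new_row[multivalued_attr] = value
            flat.append(new_row)
    return flat


# Hypothetical data: Courses holds several values per employee
employees = [
    {"Emp_ID": 100, "Name": "A", "Courses": ["SPSS", "Surveys"]},
    {"Emp_ID": 140, "Name": "B", "Courses": ["Tax Acc"]},
]

for row in to_1nf(employees, "Courses"):
    print(row)
```

In this sketch a row whose list is empty produces no output rows at all, which mirrors the insertion anomaly discussed earlier: an employee with no class cannot be represented in the flattened table.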

NOT in 1st normal form: table with multivalued attributes

1st normal form: table with no multivalued attributes and unique rows

Note: this is a relation, but not a well-structured one


Anomalies in this Table
• Insertion: if a new product is ordered for order 1007 of an existing customer, customer data must be re-entered, causing duplication
• Deletion: if we delete the Dining Table from Order 1006, we lose information concerning this item's finish and price
• Update: changing the price of product ID 4 requires updates in several records

Why do these anomalies exist?
Because there are multiple themes (entity types) in one relation. This results in duplication and an unnecessary dependency between the entities.
Second Normal Form
• 1NF PLUS every non-key attribute is fully functionally dependent on the ENTIRE primary key
• Every non-key attribute must be defined by the entire key, not by only part of the key
• No partial functional dependencies

Functional dependency diagram for INVOICE

Order_ID -> Order_Date, Customer_ID, Customer_Name, Customer_Address
Customer_ID -> Customer_Name, Customer_Address
Product_ID -> Product_Description, Product_Finish, Unit_Price
Order_ID, Product_ID -> Order_Quantity

The primary key is the composite (Order_ID, Product_ID), yet Order_ID alone and Product_ID alone each determine non-key attributes.
Therefore, NOT in 2nd Normal Form
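The 2NF test on the INVOICE dependencies can be sketched in a few lines of Python. The FDs are the ones listed on the slide; the function name `partial_dependencies` is hypothetical, and the check assumes that (Order_ID, Product_ID) is the only candidate key and that only the listed FDs are examined.

```python
KEY = {"Order_ID", "Product_ID"}

FDS = [
    ({"Order_ID"}, {"Order_Date", "Customer_ID", "Customer_Name", "Customer_Address"}),
    ({"Customer_ID"}, {"Customer_Name", "Customer_Address"}),
    ({"Product_ID"}, {"Product_Description", "Product_Finish", "Unit_Price"}),
    ({"Order_ID", "Product_ID"}, {"Order_Quantity"}),
]


def partial_dependencies(key, fds):
    """Return FDs whose determinant is a proper subset of the key
    and which determine at least one non-key attribute."""
    return [(lhs, rhs - key) for lhs, rhs in fds if lhs < key and rhs - key]


for lhs, rhs in partial_dependencies(KEY, FDS):
    print(sorted(lhs), "->", sorted(rhs))
```

Two dependencies are reported, one determined by Order_ID and one by Product_ID. The Customer_ID dependency is not reported because Customer_ID is not part of the key; it is a transitive dependency, which 3NF addresses.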


Removing partial dependencies

Getting it into Second Normal Form

Partial dependencies are removed, but there are still transitive dependencies

Third Normal Form
• 2NF PLUS no transitive dependencies (functional dependencies on non-primary-key attributes)
• Solution: The non-key determinant and the attributes that depend on it go into a new table; the non-key determinant becomes the primary key in the new table and stays as a foreign key in the old table
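The solution described above can be sketched as a small decomposition routine. The ORDER relation used here is assumed from the Order_ID dependency in the INVOICE diagram; the function name `split_transitive` is hypothetical, and the routine assumes a single key and determinants made up entirely of non-key attributes.

```python
def split_transitive(attrs, key, fds):
    """Move each non-key determinant and its dependents into a new relation.

    Returns (remaining attributes of the old relation,
             list of (primary key, attributes) for each new relation).
    """
    remaining = set(attrs)
    new_relations = []
    for lhs, rhs in fds:
        is_non_key_determinant = lhs != key and not (lhs & key)
        if is_non_key_determinant and lhs <= remaining:
            dependents = (rhs & remaining) - lhs
            if dependents:
                new_relations.append((lhs, lhs | dependents))
                # The determinant stays behind as a foreign key
                remaining -= dependents
    return remaining, new_relations


# Assumed relation after the 2NF step
ORDER_ATTRS = {"Order_ID", "Order_Date", "Customer_ID",
               "Customer_Name", "Customer_Address"}
ORDER_KEY = {"Order_ID"}
ORDER_FDS = [
    ({"Order_ID"}, {"Order_Date", "Customer_ID", "Customer_Name", "Customer_Address"}),
    ({"Customer_ID"}, {"Customer_Name", "Customer_Address"}),
]

old, new = split_transitive(ORDER_ATTRS, ORDER_KEY, ORDER_FDS)
print(sorted(old))   # ['Customer_ID', 'Order_Date', 'Order_ID']
for pk, rel in new:
    print(sorted(pk), sorted(rel))
```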

Removing transitive dependencies

Getting it into Third Normal Form

Transitive dependencies are removed

Normal Forms Defined Informally
1st normal form
• All attributes depend on the key
2nd normal form
• All attributes depend on the whole key
3rd normal form
• All attributes depend on nothing but the key

Enterprise Keys
• Primary keys that are unique in the whole database, not just within a single relation
• Corresponds with the concept of an object ID in object-oriented systems

a) Relations with enterprise key
b) Sample data with enterprise key
Boyce/Codd Normal Form
• BCNF matters for relations with more than one candidate key, where the candidate keys are composite and overlapping
• There must be no non-trivial functional dependencies of attributes on something other than a superset of a candidate key (called a superkey)
• That is, a relation is in BCNF if and only if every determinant is a candidate key
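The BCNF condition can be tested with attribute closures: every non-trivial FD must have a determinant whose closure covers the whole relation. The sketch below uses a hypothetical relation (Student, Course, Instructor) with two overlapping composite candidate keys; the relation, its FDs, and the function names are assumptions for illustration only.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attrs, fds):
    """Return the non-trivial FDs whose determinant is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        nontrivial = not rhs <= lhs
        if nontrivial and not closure(lhs, fds) >= set(attrs):
            violations.append((lhs, rhs))
    return violations


# Hypothetical relation: candidate keys are {Student, Course}
# and {Student, Instructor}, which overlap on Student
ATTRS = {"Student", "Course", "Instructor"}
FDS = [
    ({"Student", "Course"}, {"Instructor"}),
    ({"Instructor"}, {"Course"}),
]

print(bcnf_violations(ATTRS, FDS))  # [({'Instructor'}, {'Course'})]
```

In this example the relation is in 3NF, because Course is part of a candidate key, but it is not in BCNF, because Instructor is a determinant without being a superkey.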

Multi-Valued Dependencies
• A multi-valued dependency occurs when a determinant determines more than one dependent, and the dependents are independent of each other
• Example: course implies instructor; course implies text, where instructor and text are independent
• A relation with course, instructor and text is all key, and exhibits redundancy, but is in 3NF
• Updates can exhibit anomalies

Fourth Normal Form
Relation R is in 4NF if and only if, whenever there exist subsets A and B of the attributes of R such that the nontrivial multi-valued dependency A multi-determines B is satisfied, then all attributes of R are also functionally dependent on A
In the previous example, decompose (course, instructor, text) into two relations: (course, instructor) and (course, text)
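The decomposition can be checked on a small instance: when instructors and texts are independent for each course, projecting onto the two smaller relations and joining them back on course recovers exactly the original tuples. The data and the helper `project` below are hypothetical.

```python
def project(rows, attrs):
    """Project a list of dict rows onto attrs, removing duplicates."""
    return {tuple(row[a] for a in attrs) for row in rows}


# Hypothetical instance: every instructor of a course is paired with every text
rows = [
    {"Course": "Databases", "Instructor": "Rai", "Text": "Text A"},
    {"Course": "Databases", "Instructor": "Rai", "Text": "Text B"},
    {"Course": "Databases", "Instructor": "Shah", "Text": "Text A"},
    {"Course": "Databases", "Instructor": "Shah", "Text": "Text B"},
]

course_instructor = project(rows, ["Course", "Instructor"])
course_text = project(rows, ["Course", "Text"])

# Natural join of the two projections on Course
rejoined = {
    (c, i, t)
    for (c, i) in course_instructor
    for (c2, t) in course_text
    if c == c2
}

original = project(rows, ["Course", "Instructor", "Text"])
print(len(original), len(course_instructor), len(course_text))  # 4 2 2
print(rejoined == original)                                     # True
```

The four original rows are replaced by two rows in each projection, so adding a new text for the course requires one new row rather than one per instructor.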

Fifth Normal Form
A relation R is in 5NF, also called projection-join normal form, if and only if every nontrivial join dependency that is satisfied by R is implied by the candidate key(s) of R
In the general case, SPJ is not in 5NF, but SP, PJ, and JS are in 5NF
5NF is a generalization of 4NF, which is a generalization of BCNF (and hence of 3NF)
It is the most general form possible for projection-based normalization
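A join dependency over SP, PJ, and JS can be illustrated on a small instance of SPJ (supplier, part, project). In the sketch below the sample tuples are hypothetical; joining only two of the projections produces a spurious tuple, while joining all three recovers the original relation.

```python
# Hypothetical SPJ instance: (supplier, part, project)
spj = {
    ("S1", "P1", "J2"),
    ("S1", "P2", "J1"),
    ("S2", "P1", "J1"),
    ("S1", "P1", "J1"),
}

# The three binary projections
sp = {(s, p) for s, p, j in spj}
pj = {(p, j) for s, p, j in spj}
js = {(j, s) for s, p, j in spj}

# Join SP and PJ on the part
two_way = {(s, p, j) for (s, p) in sp for (p2, j) in pj if p == p2}

# Join the result with JS on (project, supplier)
three_way = {(s, p, j) for (s, p, j) in two_way if (j, s) in js}

print(sorted(two_way - spj))  # [('S2', 'P1', 'J2')]
print(three_way == spj)       # True
```

This instance satisfies the join dependency, so SPJ can be replaced by its three projections without loss. The dependency is not implied by the key of SPJ, which in this sketch is the combination of all three attributes.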
Denormalization
Denormalization is said to be necessary to improve performance
Technically, normalization is a model concept, not related to stored files
Most people confuse the two
In practice, denormalization will speed up some queries and slow down others
Proceed with caution
