Unit 3: Functional Dependency and Normalization
Unit 3.1
Functional Dependency
Introduction
• Integrity constraints ensure that changes made to the
database by authorized users do not result in a loss of
data consistency.
• Integrity constraints thus guard against accidental
damage to the database.
• We have already seen two forms of integrity constraints
for the E-R model:
• Key declarations: certain attributes form a
candidate key for a given entity set.
• Form of a relationship: many to many, one to
many, one to one.
• This unit covers the following topics, focusing on
integrity constraints that can be tested with minimal
overhead:
• Domain constraints
• Referential Integrity
• Functional Dependency
• Normalization
Domain Constraints
• A domain of possible values must be associated with
every attribute.
• SQL defines standard domain types, such as integer
types, character types, and date/time types.
• Declaring an attribute to be of a particular domain
acts as a constraint on the values that it can take.
• Domain constraints are the most elementary form of
integrity constraint.
• They are tested easily by the system whenever a
new data item is entered into the database.
• Several attributes may have the same domain.
• For example, the attributes customer-name and
employee-name might have the same domain,
i.e. the set of all person names.
• It is less clear whether customer-name and
branch-name should have the same domain.
• At the implementation level, both customer names
and branch names are character strings.
• At the conceptual level, rather than the physical
level, customer-name and branch-name should have
distinct domains.
• Equivalent statement in Oracle:
CREATE TABLE gender_domain (
    gender VARCHAR2(1) PRIMARY KEY,
    CONSTRAINT ch_gen CHECK (gender IN ('M', 'F'))
);
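The effect of such a domain constraint can be tried out with a small sketch. It assumes SQLite (through Python's standard sqlite3 module) instead of Oracle, so TEXT stands in for VARCHAR2; the helper try_insert is a hypothetical name introduced only for this illustration.

```python
import sqlite3

# In-memory SQLite database; TEXT replaces Oracle's VARCHAR2 (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gender_domain ("
    " gender TEXT PRIMARY KEY,"
    " CONSTRAINT ch_gen CHECK (gender IN ('M', 'F')))"
)

def try_insert(value):
    """Return True if the database accepts the value, False otherwise."""
    try:
        conn.execute("INSERT INTO gender_domain (gender) VALUES (?)", (value,))
        return True
    except sqlite3.IntegrityError:
        # Raised when the CHECK (or PRIMARY KEY) constraint is violated.
        return False

accepted_m = try_insert("M")   # inside the domain
rejected_x = try_insert("X")   # outside the domain, refused by the CHECK
```

The system tests the constraint at insertion time, which is why domain constraints are cheap to enforce.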
Referential Integrity
• In the alternative design, however, more tuples must be changed, which is costly.
Pitfalls in Relational-Database Design
• K is a superkey of relation schema R if “K → all attributes of R”.
• On Customer-schema:
• customer-name → customer-city
• customer-name → customer-street
• On Loan-schema:
• loan-number → amount
• loan-number → branch-name
• On Borrower-schema:
• No functional dependencies
• On Account-schema:
• account-number → branch-name
• account-number → balance
• On Depositor-schema:
• No functional dependencies
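Whether a dependency such as loan-number → amount is satisfied by a particular set of tuples can be checked mechanically. The sketch below is a minimal illustration: the function fd_holds and the sample loan rows are hypothetical, and a dependency that holds on one sample instance is not thereby proven to hold on the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Check whether lhs -> rhs is satisfied by a list of dict rows.

    Two rows that agree on every attribute in lhs must also agree
    on every attribute in rhs.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical instance of Loan-schema (made-up values).
loans = [
    {"loan-number": "L-17", "branch-name": "Downtown", "amount": 1000},
    {"loan-number": "L-23", "branch-name": "Redwood", "amount": 2000},
    {"loan-number": "L-15", "branch-name": "Downtown", "amount": 1500},
]

loan_determines_amount = fd_holds(loans, ["loan-number"], ["amount"])
branch_determines_amount = fd_holds(loans, ["branch-name"], ["amount"])
```

Here the two Downtown loans have different amounts, so branch-name → amount is violated on this instance.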
Types of functional dependencies:
• Full Functional dependency
• Partial Functional dependency
• Transitive dependency
Closure of a Set of Functional Dependencies
• The set of all FDs that are implied by a given set S of
FDs is called the closure of S, denoted by S+.
• It is not sufficient to consider only the given set of
functional dependencies.
• We need to consider all functional dependencies that
hold.
• Consider a relation schema R = (A, B, C, G, H, I)
and the set of functional dependencies
• A → B
• A → C
• CG → H
• CG → I
• B → H
• The functional dependency A → H is logically implied.
• Suppose t1 and t2 are tuples such that t1[A] = t2[A].
• Since we are given that A → B, it follows from the
definition of functional dependency that t1[B] = t2[B].
• Then, since we are given that B → H, it follows from the
definition of functional dependency that t1[H] = t2[H].
• Therefore, whenever t1[A] = t2[A], we also have
t1[H] = t2[H], which is exactly what A → H states.
• Let F be a set of functional dependencies.
• The closure of F, denoted by F+, is the set of all
functional dependencies logically implied by F.
• Given F, we can compute F+ directly from the formal
definition of functional dependency.
• If F is large, this process would be lengthy and
difficult.
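In practice, whether a single dependency α → β belongs to F+ is usually tested through the closure of the attribute set α under F, rather than by enumerating F+. The following is a minimal sketch of that test on the example above; the function names attribute_closure and implies are chosen for this illustration, and each attribute is assumed to be a single character.

```python
def attribute_closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs whenever lhs is already inside the closure.
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure

def implies(fds, lhs, rhs):
    """lhs -> rhs is in F+ exactly when rhs is contained in the closure of lhs."""
    return set(rhs) <= attribute_closure(lhs, fds)

# F from the example: each pair is (left-hand side, right-hand side).
F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

closure_of_a = attribute_closure("A", F)
a_implies_h = implies(F, "A", "H")
a_implies_i = implies(F, "A", "I")
```

The closure of A is {A, B, C, H}, so A → H is implied while A → I is not, because I can only be reached together with G.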
Armstrong’s Axioms (Rules of inference)
• Axioms, or rules of inference, provide a simpler
technique for reasoning about functional
dependencies.
1. Reflexivity rule.
• If α is a set of attributes and β ⊆ α, then α → β
holds.
2. Augmentation rule.
• If α → β holds and γ is a set of attributes, then γα →
γβ holds.
3. Transitivity rule.
• If α → β holds and β → γ holds, then α → γ holds.
• Armstrong’s axioms are sound, because they do not
generate any incorrect functional dependencies.
• They are complete, because, for a given set F of
functional dependencies, they allow us to generate
all of F+.
• Although Armstrong’s axioms are complete, it is
tiresome to use them directly for the computation
of F+.
• To simplify matters further, we list additional rules.
• Union rule.
• If α → β holds and α → γ holds, then α → βγ holds.
• Decomposition rule.
• If α → βγ holds, then α → β holds and α → γ holds.
• Pseudotransitivity rule.
• If α → β holds and γβ → δ holds, then αγ → δ holds.
1. Reflexivity: if B is a subset of A, then A → B.
2. Augmentation: if A → B, then AC → BC.
3. Transitivity: if A → B and B → C, then A → C.
4. Self-determination: A → A.
5. Decomposition: if A → BC, then A → B and A → C.
6. Union: if A → B and A → C, then A → BC.
7. Composition: if A → B and C → D, then AC → BD.
8. General unification: if A → B and C → D, then A ∪ (C − B) → BD.
• Let us apply our rules to the example of schema
• R = (A, B, C, G, H, I) and
• the set F of functional dependencies
• {A → B, A → C, CG → H, CG → I, B → H}.
• We list several members of F+ here:
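As a worked sketch, the rules above yield, for instance, the following members of F+. The derivation steps are written out here for illustration; each line names the rule that justifies it.

```latex
\begin{align*}
A \to B,\; B \to H &\;\Rightarrow\; A \to H && \text{(transitivity)} \\
CG \to H,\; CG \to I &\;\Rightarrow\; CG \to HI && \text{(union)} \\
A \to C &\;\Rightarrow\; AG \to CG && \text{(augmentation with } G\text{)} \\
AG \to CG,\; CG \to I &\;\Rightarrow\; AG \to I && \text{(transitivity)}
\end{align*}
```

The last two lines together are an instance of the pseudotransitivity rule applied to A → C and CG → I.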
Is A extraneous in AB → C?
• Compute (AB − A)+ under F.
• If the result contains C, then A is extraneous.
Is B extraneous in AB → C?
• Compute (AB − B)+ under F.
• If the result contains C, then B is extraneous.
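The two checks can be sketched in code. The set F used below is a hypothetical example, {AB → C, A → C}, since no particular F is given for this check; the function names are likewise chosen for illustration, and each attribute is assumed to be a single character.

```python
def attribute_closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure

def is_extraneous_on_lhs(attr, lhs, rhs, fds):
    """attr is extraneous in lhs -> rhs if (lhs - attr)+ under fds contains rhs."""
    reduced = set(lhs) - {attr}
    return set(rhs) <= attribute_closure(reduced, fds)

# Hypothetical F, assumed only for this illustration.
F = [("AB", "C"), ("A", "C")]

a_extraneous = is_extraneous_on_lhs("A", "AB", "C", F)  # uses B+ = {B}
b_extraneous = is_extraneous_on_lhs("B", "AB", "C", F)  # uses A+ = {A, C}
```

With this F, B is extraneous because A alone already determines C, while A is not.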
Normalization
Why Database Normalization Matters
Equation: R(A1, A2, ..., An) is in 1NF if, for every
attribute Ai, the values are atomic.
Breaking It Down
Understanding Atomic Values
Table: “Students”
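As a small illustration of atomic values, the sketch below splits a multi-valued column into one row per value. The rows, the column names, and the function to_1nf are hypothetical and made up for this example; they are not taken from the Students table.

```python
def to_1nf(rows, multi_attr, sep=","):
    """Expand a column holding several values into one row per atomic value."""
    atomic_rows = []
    for row in rows:
        for value in row[multi_attr].split(sep):
            new_row = dict(row)              # copy the other attributes
            new_row[multi_attr] = value.strip()
            atomic_rows.append(new_row)
    return atomic_rows

# Made-up data: the "courses" column is not atomic in the first row.
students = [
    {"student_id": 1, "name": "Asha", "courses": "Math, Physics"},
    {"student_id": 2, "name": "Ravi", "courses": "Chemistry"},
]

atomic = to_1nf(students, "courses")
```

After the split, every cell holds a single value, at the cost of repeating student_id and name, which is the redundancy the later normal forms address.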
Equation: R(A1, A2, ..., An) is in 2NF if it is in 1NF and
each non-key attribute Ai is fully functionally dependent on
the entire primary key, not on a proper subset of it.
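A 2NF violation is a partial dependency: a non-key attribute determined by a proper subset of the key. The sketch below looks for such dependencies using attribute closure. The schema, the dependencies, and the function names are hypothetical and assumed only for this illustration.

```python
from itertools import combinations

def attribute_closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure

def partial_dependencies(key, non_key_attrs, fds):
    """List (subset, attribute) pairs where a proper subset of key determines attribute."""
    found = []
    key = sorted(key)
    for size in range(1, len(key)):          # proper, non-empty subsets only
        for subset in combinations(key, size):
            closure = attribute_closure(subset, fds)
            for attr in sorted(non_key_attrs):
                if attr in closure:
                    found.append((subset, attr))
    return found

# Hypothetical schema with key (order_id, product_id).
fds = [
    (("order_id", "product_id"), ("quantity",)),
    (("product_id",), ("product_name",)),
]
violations = partial_dependencies(
    ("order_id", "product_id"), {"quantity", "product_name"}, fds
)
```

Here product_name depends on product_id alone, so it would move to a separate table keyed by product_id, while quantity stays with the full key.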
Table: “Orders”
After 2NF
Why 2NF Matters
A relation is in 5NF when it is in 4NF and cannot be broken into
smaller tables any further without losing information; the
decomposition is carried as far as possible in order to avoid redundancy.
• In the above table, John takes both the Computer and Math classes
for Semester 1, but he doesn't take the Math class for Semester 2.
• In this case, the combination of all these fields is required to
identify a valid record.
5NF
Suppose we add a new semester, Semester 3, but do not yet know
the subject or who will be taking that subject, so we leave
Lecturer and Subject as NULL.