DBMS Unit III: Normal Forms
SCHEMA REFINEMENT
Schema refinement means improving a schema by applying a systematic technique; the main
technique is decomposition. Normalisation, or schema refinement, is a technique for
organizing the data in a database. It is a systematic approach of decomposing tables to
eliminate data redundancy and undesirable characteristics such as insertion, update and
deletion anomalies. Redundancy refers to the repetition of the same data, or duplicate
copies of the same data stored in different locations.
Anomalies: Anomalies are the problems that occur in poorly planned, unnormalised databases
where all the data is stored in one table, which is sometimes called a flat file database.
Problems Caused by Redundancy: Let us consider such a schema.
Storing the same information redundantly, that is, in more than one place within a database,
can lead to several problems:
Update anomaly: If the fee changes from 5000 to 7000, we have to update the FEE column in
every row that stores it; otherwise the data becomes inconsistent.
Insertion anomaly and deletion anomaly: These anomalies arise only because of redundancy;
without redundancy they do not occur.
Insertion anomaly: A new course C4 is introduced, but no student has yet taken C4, so there
is no row in which to record it.
To avoid redundancy and the problems it causes, we use a refinement technique called
decomposition.
Decomposition: The process of breaking a larger relation into smaller relations. Each of the
smaller relations contains a subset of the attributes of the original relation.
Armstrong’s Axioms: Armstrong’s axioms are a set of rules for reasoning about functional
dependencies and for inferring all the functional dependencies that hold on a relational
database. The axioms, or inference rules, are:
Here, roll_no → name is a non-trivial functional dependency, since the dependent name is not a
subset of the determinant roll_no.
Similarly, {roll_no, name} → age is also a non-trivial functional dependency, since age is not a
subset of {roll_no, name}.
2. Trivial Functional Dependency: A → B is a trivial functional dependency if B is a subset of
A. In a trivial functional dependency, the dependent is always a subset of the determinant, i.e. if
X → Y and Y is a subset of X, then X → Y is called a trivial functional dependency.
Dependencies such as A → A, B → B and AB → AB are also trivial. For example,
roll_no name age
42 smith 17
43 john 18
44 peter 18
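As a rough illustration, both ideas can be checked against the sample table in Python. This is only a sketch: the helper names fd_holds and is_trivial are made up for this example, and testing sample rows can only show that the data does not contradict an FD; it cannot prove that the FD holds for the schema in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_trivial(lhs, rhs):
    """An FD X -> Y is trivial when Y is a subset of X."""
    return set(rhs) <= set(lhs)


# Sample rows copied from the table above
students = [
    {"roll_no": 42, "name": "smith", "age": 17},
    {"roll_no": 43, "name": "john", "age": 18},
    {"roll_no": 44, "name": "peter", "age": 18},
]

print(fd_holds(students, ["roll_no"], ["name"]))   # roll_no -> name
print(fd_holds(students, ["age"], ["name"]))       # age -> name
print(is_trivial(["roll_no", "name"], ["name"]))   # {roll_no, name} -> name
print(is_trivial(["roll_no"], ["name"]))           # roll_no -> name
```

Here age → name fails because two different students share the age 18, while {roll_no, name} → name is trivial because name already appears in the determinant.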
Domain Courses
Programming Java, C
Web designing PHP, HTML
The above table contains multiple values in a single column; these can be reduced to
atomic values by using first normal form, as follows:
Domain Courses
Programming Java
Programming C
Web designing PHP
Web designing HTML
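The same conversion can be sketched in Python. The function name to_first_normal_form and the assumption that the multiple values are separated by commas are choices made for this example only.

```python
def to_first_normal_form(rows):
    """Split a comma-separated Courses value into one row per course."""
    result = []
    for domain, courses in rows:
        for course in courses.split(","):
            # strip() removes the space left after each comma
            result.append((domain, course.strip()))
    return result


# Unnormalised rows: the Courses column holds several values at once
unnormalised = [
    ("Programming", "Java, C"),
    ("Web designing", "PHP, HTML"),
]

for row in to_first_normal_form(unnormalised):
    print(row)
```

Each output row now holds exactly one course, so every column value is atomic.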
Student id Student name Project id Project name
101 smith P1 x
102 john P2 y
Here (student id, project id) are the key attributes and (student name, project name) are
non-prime attributes. The relation is decomposed as:
Student id Student name Project id
101 smith P1
102 john P2
Project id Project name
P1 x
P2 y
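A decomposition of this kind is a pair of projections. The sketch below shows one way to compute them in Python; the helper name project and the column names written with underscores are assumptions made for this example.

```python
def project(rows, attrs):
    """Project a list of dict rows onto attrs, dropping duplicate tuples."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


# The original relation, as listed above
student_project = [
    {"student_id": 101, "student_name": "smith",
     "project_id": "P1", "project_name": "x"},
    {"student_id": 102, "student_name": "john",
     "project_id": "P2", "project_name": "y"},
]

students = project(student_project,
                   ["student_id", "student_name", "project_id"])
projects = project(student_project, ["project_id", "project_name"])

print(students)
print(projects)
```

Because the project name is stored once per project in the second table, renaming a project touches a single row regardless of how many students work on it.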
Third Normal Form:
A relation is said to be in third normal form if it is already in second normal form and no
transitive dependency exists for its non-prime attributes.
3NF is used to reduce data duplication and to help achieve data integrity. A transitive
dependency can cause update anomalies, so removing transitive dependencies removes that
source of anomalies.
Prepared By D Madhu BaBu
Let R be a relation schema, X be a subset of the attributes of R, and A be an attribute of R. R is in
third normal form if, for every FD X → A that holds over R, one of the following statements is
true:
• A ∈ X; that is, it is a trivial FD, or
• X is a super key, or
• A is part of some key for R.
OR
A relation is in 3NF if at least one of the following conditions holds for every non-trivial
functional dependency X → Y:
1. X is a super key.
2. Y is a prime attribute, i.e., each element of Y is part of some candidate key.
Transitive Functional Dependency (TFD): If A → B and B → C are two FDs, then A → C is called a
transitive dependency.
Transitive Dependencies
Suppose that a dependency X → A causes a violation of 3NF. There are two cases:
1. X is a proper subset of some key K. Such a dependency is sometimes called a partial
dependency.
2. X is not a proper subset of any key. Such a dependency is sometimes called a transitive
dependency because it means we have a chain of dependencies K → X → A. The problem is that
we cannot associate an X value with a K value unless we also associate an A value with an X
value.
OR
A relation is in BCNF if every determinant is a candidate key. If a non-prime attribute
determines one or more prime attributes, then the relation is not in BCNF.
Advisor Major
faculty1 Physics
faculty2 Music
faculty3 Bio
faculty4 Physics
******************************************************************************
Example: Convert the following relation into 1NF, 2NF and 3NF.
CID → Ctitle
CID → Iname
CID → Iloc
Iname → Iloc is a transitive dependency; hence the Course relation is not in 3NF, but it is in 2NF.
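The 3NF test can be sketched in Python using the four dependencies of the Course relation. The function names closure and violations_3nf are made up for this example, CID is assumed to be the only candidate key, and the split into Course(CID, Ctitle, Iname) and a second relation called Instructor(Iname, Iloc) is the usual fix, assumed here rather than taken from the notes.

```python
def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def violations_3nf(schema, fds, candidate_keys):
    """List (X, A) pairs where X -> A breaks 3NF."""
    prime = set().union(*candidate_keys)  # attributes in some candidate key
    bad = []
    for lhs, rhs in fds:
        for a in rhs:
            if a in lhs:
                continue  # trivial FD
            if closure(lhs, fds) >= set(schema):
                continue  # X is a super key
            if a in prime:
                continue  # A is a prime attribute
            bad.append((lhs, a))
    return bad


course = ("CID", "Ctitle", "Iname", "Iloc")
course_fds = [
    (("CID",), ("Ctitle",)),
    (("CID",), ("Iname",)),
    (("CID",), ("Iloc",)),
    (("Iname",), ("Iloc",)),
]
print(violations_3nf(course, course_fds, [("CID",)]))

# Assumed decomposition that removes the transitive dependency
course2_fds = [(("CID",), ("Ctitle",)), (("CID",), ("Iname",))]
instructor_fds = [(("Iname",), ("Iloc",))]
print(violations_3nf(("CID", "Ctitle", "Iname"), course2_fds, [("CID",)]))
print(violations_3nf(("Iname", "Iloc"), instructor_fds, [("Iname",)]))
```

Only Iname → Iloc is reported for the original relation, and neither relation in the assumed decomposition has a violation.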
Registration relation
SID CID Grade
111 IS11 A
111 IS22 B
222 IS11 C
222 IT22 B
222 IT33 A
For example, the relation EMP_PROJECTS in the figure has a trivial MVD. An MVD that satisfies
neither (a) nor (b) is called a nontrivial MVD.
Definition. A relation schema R is in 4NF with respect to a set of dependencies F (that includes
functional dependencies and multivalued dependencies) if, for every nontrivial multivalued
dependency X →→ Y in F+, X is a superkey for R.
The process of normalizing a relation that involves nontrivial MVDs and is not in 4NF consists of
decomposing it so that each MVD is represented by a separate relation in which it becomes a trivial
MVD. Consider the EMP relation in the figure. EMP is not in 4NF because it has nontrivial MVDs
whose left-hand side, Ename, is not a superkey of EMP.
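Whether an MVD holds on a given set of rows can be tested mechanically. The sketch below is an illustration only: the function name mvd_holds, the column names Pname and Dname, and the sample rows are assumptions chosen to resemble the usual textbook form of the EMP example, not data taken from the figure.

```python
def mvd_holds(rows, attrs, lhs, rhs):
    """Check X ->> Y on a relation instance given as tuples.

    For every pair of rows t1, t2 that agree on X, the row that takes
    X and Y from t1 and all remaining attributes from t2 must exist.
    """
    rows = set(rows)
    index = {a: i for i, a in enumerate(attrs)}
    rest = [a for a in attrs if a not in lhs and a not in rhs]
    for t1 in rows:
        for t2 in rows:
            if all(t1[index[a]] == t2[index[a]] for a in lhs):
                mixed = tuple(
                    t2[index[a]] if a in rest else t1[index[a]]
                    for a in attrs
                )
                if mixed not in rows:
                    return False
    return True


attrs = ("Ename", "Pname", "Dname")
# Assumed sample: every project of Smith paired with every dependent
emp = [
    ("Smith", "X", "John"),
    ("Smith", "Y", "Anna"),
    ("Smith", "X", "Anna"),
    ("Smith", "Y", "John"),
]

print(mvd_holds(emp, attrs, ["Ename"], ["Pname"]))
print(mvd_holds(emp[:3], attrs, ["Ename"], ["Pname"]))
```

Dropping any one of the four rows breaks the MVD, which is why a relation like this has to store every project and dependent combination for an employee.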
Fig: Decomposing the EMP relation into two 4NF relations EMP_PROJECTS and EMP_DEPENDENTS.
We decompose EMP into EMP_PROJECTS and EMP_DEPENDENTS, shown in the above figure. Both
EMP_PROJECTS and EMP_DEPENDENTS are in 4NF, because in each of them the MVD becomes trivial.
Rules of decomposition:
If R is a relation split into relations R1 and R2, the decomposition should satisfy the
following:
R1(attributes) ∪ R2(attributes) = R(attributes)
R1(attributes) ∩ R2(attributes) ≠ ∅
R1(attributes) ∩ R2(attributes) = key attribute(s) of R1 or of R2
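These three rules can be checked from the functional dependencies alone. The sketch below is a minimal version under stated assumptions: the function names are made up, and the relation R(A, B, C) with the dependencies A → B and B → C is a hypothetical example chosen only to exercise the test.

```python
def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def satisfies_decomposition_rules(r, r1, r2, fds):
    """Apply the three rules to a split of r into r1 and r2."""
    r, r1, r2 = set(r), set(r1), set(r2)
    if r1 | r2 != r:
        return False  # rule 1: no attribute may be lost
    common = r1 & r2
    if not common:
        return False  # rule 2: the parts must share an attribute
    # rule 3: the shared attributes must determine all of r1 or all of r2
    reach = closure(common, fds)
    return r1 <= reach or r2 <= reach


# Hypothetical dependencies, assumed for this example: A -> B and B -> C
fds = [(("A",), ("B",)), (("B",), ("C",))]

print(satisfies_decomposition_rules("ABC", "AB", "BC", fds))
print(satisfies_decomposition_rules("ABC", "AC", "BC", fds))
```

With these assumed dependencies the split into (A, B) and (B, C) passes, because the shared attribute B is a key of (B, C). The split into (A, C) and (B, C) fails, because the shared attribute C determines nothing else.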
Properties of decomposition:
Lossless decomposition: When the smaller tables are joined, no data should be lost, and the
decomposition should satisfy all the rules of decomposition. No additional data should be
generated by the natural join of the decomposed tables.
No information from the original relation is lost during decomposition. When the sub-relations are
joined back, the same relation that was decomposed is obtained. Every decomposition must be
lossless:
R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn = R
A B C
1 2 1
2 5 3
3 3 3
R( A , B , C )
Suppose this relation is decomposed into two sub-relations R1( A , B ) and R2( B , C ):
A B
1 2
2 5
3 3
R1( A , B )
B C
2 1
5 3
3 3
R2( B , C )
Now, let us check whether this decomposition is lossless. For a lossless decomposition,
we must have
R1 ⋈ R2 = R
Now, if we perform the natural join ( ⋈ ) of the sub-relations R1 and R2, we get:
A B C
1 2 1
2 5 3
3 3 3
This relation is the same as the original relation R. Thus, we conclude that the above decomposition
is a lossless join decomposition.
NOTE: Lossless join decomposition is also known as non-additive join decomposition, because the
relation obtained by joining the sub-relations is the same as the relation that was decomposed.
No extraneous tuples appear after joining the sub-relations.
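The join in this example can be reproduced with a few lines of Python. This is a minimal sketch: natural_join is a helper written for this example, relations are represented as sets of tuples, and the nested loop is chosen for clarity rather than speed.

```python
def natural_join(r1, attrs1, r2, attrs2):
    """Natural join of two relations given as sets of tuples."""
    common = [a for a in attrs1 if a in attrs2]
    out_attrs = list(attrs1) + [a for a in attrs2 if a not in attrs1]
    result = set()
    for t1 in r1:
        row1 = dict(zip(attrs1, t1))
        for t2 in r2:
            row2 = dict(zip(attrs2, t2))
            # keep the pair only if it agrees on every shared attribute
            if all(row1[a] == row2[a] for a in common):
                merged = {**row1, **row2}
                result.add(tuple(merged[a] for a in out_attrs))
    return out_attrs, result


r = {(1, 2, 1), (2, 5, 3), (3, 3, 3)}   # R(A, B, C)
r1 = {(1, 2), (2, 5), (3, 3)}           # R1(A, B)
r2 = {(2, 1), (5, 3), (3, 3)}           # R2(B, C)

out_attrs, joined = natural_join(r1, ["A", "B"], r2, ["B", "C"])
print(out_attrs, sorted(joined))
print(joined == r)
```

The joined set equals R, which matches the conclusion that this decomposition is lossless.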
A B C
1 2 1
2 5 3
3 3 3
R( A , B , C )
Suppose this relation is decomposed into two sub-relations R1( A , C ) and R2( B , C ):
A C
1 1
2 3
3 3
R1( A , C )
B C
2 1
5 3
3 3
R2( B , C )
Now, let us check whether this decomposition is lossy. For a lossy decomposition, we must
have R1 ⋈ R2 ⊃ R. Now, if we perform the natural join ( ⋈ ) of the sub-relations R1 and R2, we
get:
A B C
1 2 1
2 5 3
2 3 3
3 5 3
3 3 3
This relation is not the same as the original relation R: it contains the extraneous tuples
(2, 3, 3) and (3, 5, 3). Clearly, R1 ⋈ R2 ⊃ R, so this decomposition is lossy.
Dependency Preservation: The functional dependencies should still be satisfied after a relation is
split, and each of them should be satisfied by one of the split tables. A decomposition
D = { R1, R2, R3, …, Rn } of R is dependency preserving with respect to a set F of functional
dependencies if (F1 ∪ F2 ∪ … ∪ Fn)+ = F+, where Fi is the set of dependencies in F+ that involve
only the attributes of Ri.
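One common way to test this condition without listing every projected dependency is sketched below in Python. The function names are made up for this example, and the relation R(A, B, C) with the dependencies A → B and B → C is a hypothetical input, not one given in these notes.

```python
def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserves_dependencies(fds, decomposition):
    """True if every FD in fds follows from the projected dependencies."""
    for lhs, rhs in fds:
        reach = set(lhs)
        changed = True
        while changed:
            changed = False
            for ri in decomposition:
                ri = set(ri)
                # attributes of ri derivable using only what ri can see
                gained = closure(reach & ri, fds) & ri
                if not gained <= reach:
                    reach |= gained
                    changed = True
        if not set(rhs) <= reach:
            return False
    return True


# Hypothetical dependencies, assumed for this example: A -> B and B -> C
fds = [(("A",), ("B",)), (("B",), ("C",))]

print(preserves_dependencies(fds, ["AB", "BC"]))
print(preserves_dependencies(fds, ["AC", "BC"]))
```

Under these assumptions the split into (A, B) and (B, C) preserves both dependencies, while the split into (A, C) and (B, C) loses A → B because no single table contains both A and B and the remaining dependencies cannot recover it.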
Consider a relation R with a set F of functional dependencies. If R is decomposed into R1 with
FDs { f1 } and R2 with FDs { f2 }, then there can be three cases:
******************************************************************************