1 - Dbms Module 4 PPT 1
Module 4: Normalization
PREPARED BY SHARIKA T R, SNGCE
SYLLABUS
• Different anomalies in designing a database, The idea of
normalization, Functional dependency, Armstrong’s
Axioms (proofs not required), Closures and their
computation, Equivalence of Functional Dependencies
(FD), Minimal Cover (proofs not required).
• First Normal Form (1NF), Second Normal Form (2NF),
Third Normal Form (3NF), Boyce-Codd Normal Form
(BCNF).
• Lossless join and dependency preserving decomposition,
Algorithms for checking Lossless Join (LJ) and
Dependency Preserving (DP) properties.
Visit my YouTube Page: https://www.youtube.com/c/sharikatr for notes and PPTs
Guideline 1
• Design a relation schema so that it is easy to explain its meaning.
• Do not combine attributes from multiple entity types and
relationship types into a single relation.
• Intuitively, if a relation schema corresponds to one entity type or
one relationship type, it is straightforward to interpret and to
explain its meaning.
• Otherwise, if the relation corresponds to a mixture of multiple
entities and relationships, semantic ambiguities will result and the
relation cannot be easily explained.
Ambiguity: combining attributes from Employee and Department into a single table
gives a relation that lacks a clear meaning.
Insertion Anomalies
• Consider the relation:
• EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)
• Insert Anomaly:
▫ Cannot insert a project unless an employee is assigned to it.
• Conversely
▫ Cannot insert an employee unless he/she is assigned to a
project.
Deletion Anomalies
• If we delete from EMP_DEPT an employee tuple that happens to
represent the last employee working for a particular department,
the information concerning that department is lost from the
database.
• Consider the relation: EMP_PROJ(Emp#, Proj#, Ename, Pname,
No_hours)
• Delete Anomaly:
▫ When a project is deleted, it will result in deleting all the employees
who work on that project.
▫ Alternately, if an employee is the sole employee on a project, deleting
that employee would result in deleting the corresponding project.
Modification Anomalies
• In EMP_DEPT, if we change the value of one of the attributes
of a particular department, say the manager of department 5,
▫ we must update the tuples of all employees who work in that
department;
▫ otherwise, the database will become inconsistent.
• If we fail to update some tuples, the same department will
be shown to have two different values for manager in
different employee tuples, which would be wrong.
Guideline 2
• Design a schema that does not suffer from the insertion,
deletion and update anomalies.
• If there are any anomalies present, then note them so that
applications can be made to take them into account.
Guideline 3
• Relations should be designed such that their tuples will have as
few NULL values as possible
• Attributes that are NULL frequently could be placed in separate
relations (with the primary key)
• For example, if only 15 percent of employees have individual
offices,
▫ there is little justification for including an attribute
Office_number in the EMPLOYEE relation;
▫ rather, a relation EMP_OFFICES(Essn, Office_number) can
be created to include tuples for only the employees with
individual offices
Generation of Spurious Tuples: avoid at any cost
• Consider the tables
▫ EMP_LOCS(EName, PLocation)
▫ EMP_PROJ1(SSN, PNumber, Hours, PName, PLocation)
• versus the table
▫ EMP_PROJ(SSN, PNumber, Hours, EName, PName, PLocation)
• If we use the former as our base tables then we cannot recover all the
information of the latter because trying to natural join the two tables will
produce many rows not in EMP_PROJ.
• These extra rows are called spurious tuples.
• Another design guideline is that relation schemas should be designed so
that they can be joined with equality conditions on attributes that are either
primary keys or foreign keys in a way such that no spurious tuples are
generated.
R(A, B, C)
A   B   C
a1  b1  c1
a2  b1  c1
a1  b2  c2

R1(A)
A
a1
a2

R2(B, C)
B   C
b1  c1
b2  c2

R1 × R2
A   B   C
a1  b1  c1
a2  b1  c1
a1  b2  c2
a2  b2  c2   (SPURIOUS TUPLE)
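The spurious tuple can be reproduced with a few lines of Python. This is only a sketch: it assumes R holds the three tuples (a1, b1, c1), (a2, b1, c1) and (a1, b2, c2), projects R onto R1(A) and R2(B, C), and joins the projections back. Since R1 and R2 share no attribute, the natural join degenerates to the Cartesian product.

```python
from itertools import product

# Relation R(A, B, C), assumed to hold these three tuples.
R = {("a1", "b1", "c1"), ("a2", "b1", "c1"), ("a1", "b2", "c2")}

# Project R onto R1(A) and R2(B, C).
R1 = {(a,) for (a, b, c) in R}
R2 = {(b, c) for (a, b, c) in R}

# R1 and R2 have no common attribute, so the natural join
# is simply the Cartesian product R1 x R2.
joined = {(a, b, c) for (a,), (b, c) in product(R1, R2)}

# Tuples that appear in the join but were never in R are spurious.
spurious = joined - R
print(sorted(spurious))  # [('a2', 'b2', 'c2')]
```

The join returns four tuples although R had only three, so the decomposition loses information about which A value goes with which (B, C) pair.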
Guideline 4
• Design relation schemas so that they can be joined with
equality conditions on attributes that are appropriately
related (primary key, foreign key) pairs in a way that
guarantees that no spurious tuples are generated.
• Avoid relations that contain matching attributes that are
not (foreign key, primary key) combinations because
joining on such attributes may produce spurious tuples
Functional dependencies
• Functional Dependencies
▫ Are used to specify formal measures of the "goodness" of relational
designs
▫ And keys are used to define normal forms for relations
▫ Are constraints that are derived from the meaning and
interrelationships of the data attributes
▫ A functional dependency is a constraint between two sets of
attributes from the database.
▫ Suppose that our relational database schema has n attributes (A1, A2,
..., An).
▫ A set of attributes X functionally determines a set of attributes Y if
the value of X determines a unique value for Y.
• X→Y holds if whenever two tuples have the same value for X, they
must have the same value for Y
• For any two tuples t1 and t2 in any relation instance r(R):
▫ If t1[X]=t2[X], then t1[Y]=t2[Y]
• X→Y in R specifies a constraint on all relation instances r(R)
• Written as X→Y; can be displayed graphically on a relation
schema, with the dependency denoted by an arrow from X to Y.
• FDs are derived from the real-world constraints on the attributes
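The tuple-based definition above translates directly into a check on a relation instance. The sketch below is illustrative only: the function name fd_holds and the sample rows are hypothetical, not part of the slides. Note that an instance can only refute an FD; an FD that happens to hold in one instance is not thereby a constraint of the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is satisfied by this relation instance.

    rows: list of dicts mapping attribute name -> value.
    lhs, rhs: lists of attribute names.
    """
    seen = {}  # maps each X-value to the Y-value first seen with it
    for t in rows:
        x = tuple(t[a] for a in lhs)
        y = tuple(t[a] for a in rhs)
        # Two tuples with the same X-value must agree on Y.
        if seen.setdefault(x, y) != y:
            return False
    return True

# Hypothetical EMPLOYEE instance (made-up values).
r = [
    {"Eid": 1, "Ename": "Anu", "Eage": 30},
    {"Eid": 2, "Ename": "Anu", "Eage": 25},
    {"Eid": 3, "Ename": "Binu", "Eage": 30},
]
print(fd_holds(r, ["Eid"], ["Ename"]))   # True
print(fd_holds(r, ["Ename"], ["Eid"]))   # False: two employees named Anu
print(fd_holds(r, ["Eage"], ["Ename"]))  # False: two employees aged 30
```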
B → A implies
B functionally determines A
A functionally depends on B
A is functionally determined by B
Exercise
EMPLOYEE(Eid, Ename, Eage, Dnum)
DEPT(Dno, Dname, Dloc)
Find valid FDs
1. Eid→Ename
2. Ename→Eid
3. Eage→Ename
4. Dno→Dname, Dloc
• C → B;
• {A, B} → C;
• {A, B} → D; and
• {C, D} → B.
• However, the following do not hold because we already
have violations of them in the given extension:
• A → B (tuples 1 and 2 violate this constraint);
• B → A (tuples 2 and 3 violate this constraint);
• D → C (tuples 3 and 4 violate it).
2. Non-Trivial FD
In a non-trivial functional dependency, the dependent is
not a subset of the determinant.
i.e. if X → Y and Y is not a subset of X, then X → Y is called a
non-trivial functional dependency.
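• R(V,W,X,Y,Z)
• FD = {VY→W, WX→Z, ZY→V}
• Attributes: V, W, X, Y, Z. X and Y never appear on the right-hand side
of any FD, so XY must be part of every candidate key (CK).
XY+ = XY (XY alone is not a key)
VXY+ = VXYWZ ✔ CK
WXY+ = WXYZV ✔ CK
ZXY+ = ZXYVW ✔ CK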
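The closure computation used in this example can be sketched in Python. This is a minimal brute-force illustration, assuming single-letter attribute names and FDs written as (left, right) string pairs; the helper names closure and candidate_keys are hypothetical. The search tries attribute sets in order of increasing size and skips supersets of keys already found, so only minimal keys are reported.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure: every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Brute-force search for all minimal attribute sets whose closure is the schema."""
    schema = set(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            s = set(combo)
            if any(k <= s for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(s, fds) == schema:
                keys.append(s)
    return keys

fds = [("VY", "W"), ("WX", "Z"), ("ZY", "V")]
print(sorted(closure("XY", fds)))                 # ['X', 'Y']
keys = candidate_keys("VWXYZ", fds)
print(["".join(sorted(k)) for k in keys])         # ['VXY', 'WXY', 'XYZ']
```

For this FD set the search reports three candidate keys: VXY, WXY and XYZ. Each of them contains XY, as expected. The exhaustive search is exponential in the number of attributes, so it is suitable only for small classroom examples.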
• Axiom of reflexivity
▫ If A is a set of attributes and B is a subset of A, then A determines B: if B⊆A
then A→B.
▫ Such dependencies are trivial.
• Axiom of augmentation
▫ If A→B holds and Y is attribute set, then AY→BY also holds.
▫ That is adding attributes in dependencies, does not change the basic
dependencies.
▫ If A→B , then AC→BC for any C.
• Axiom of transitivity
▫ Same as the transitive rule in algebra, if A→B holds and B→C holds,
then A→C also holds.
▫ A→B is read as "A functionally determines B". If X→Y and Y→Z,
then X→Z.
Secondary Rules
• Union
▫ If A→B holds and A→C holds, then A→BC holds.
▫ If X→Y and X→Z then X→YZ
• Composition
▫ If A→B and X→Y holds, then AX→BY holds.
• Decomposition
▫ If A→BC holds then A→B and A→C hold. If X→YZ then
X→Y and X→Z
• Pseudo Transitivity
▫ If A→B holds and BC→D holds, then AC→D holds. If X→Y
and YZ→W then XZ→W.
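Although proofs are not required by the syllabus, it may help to see that the secondary rules follow from the three axioms. As an illustration, the union rule can be derived as follows (a sketch using only augmentation and transitivity):

```latex
\begin{align*}
X \to Y &\;\Rightarrow\; X \to XY
    && \text{(augment both sides with } X\text{; } XX = X\text{)} \\
X \to Z &\;\Rightarrow\; XY \to YZ
    && \text{(augment both sides with } Y\text{)} \\
X \to XY \ \text{and}\ XY \to YZ &\;\Rightarrow\; X \to YZ
    && \text{(transitivity)}
\end{align*}
```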
Equivalence of Sets of Functional Dependencies
Definition.
A set of functional dependencies F is said to cover another set of
functional dependencies E if every FD in E is also in F+; that is, if
every dependency in E can be inferred from F; alternatively, we can
say that E is covered by F.
Definition
Two sets of functional dependencies E and F are equivalent if
E+ = F+. Therefore, equivalence means that every FD in E can be
inferred from F, and every FD in F can be inferred from E; that is, E
is equivalent to F if both the conditions—E covers F and F covers E—
hold.
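The two definitions suggest a simple test: F covers E if, for every X → Y in E, Y is contained in the closure of X computed under F. The sketch below assumes single-letter attribute names; the function names and the sample FD sets F and G are illustrative and do not come from the slides.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def covers(f, e):
    """True if every FD in e can be inferred from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in e)

def equivalent(e, f):
    """E and F are equivalent if each covers the other."""
    return covers(e, f) and covers(f, e)

# Illustrative FD sets (assumed for this sketch).
F = [("A", "C"), ("AC", "D"), ("E", "AD"), ("E", "H")]
G = [("A", "CD"), ("E", "AH")]
print(covers(F, G), covers(G, F))  # True True
print(equivalent(F, G))            # True
```

This avoids computing F+ or E+ explicitly, which would be exponentially large; only attribute closures are needed.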
• Step 3.
▫ As both FD2 ⊇ FD1 and FD1 ⊇ FD2 hold (each set covers the other),
FD1 and FD2 are equivalent.
▫ These two FD sets are semantically equivalent.
{B → A, D → A, AB → D}
E : {B → A, D → A, AB → D}
AB → D may be replaced by B → D:
B → A gives BB → AB (augmentation with B), i.e. B → AB;
B → AB and AB → D give B → D (transitivity).
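The usual three-step procedure for a minimal cover (split right-hand sides, remove extraneous left-hand attributes, remove redundant FDs) can be sketched as below. This is an illustrative implementation under the assumption of single-letter attribute names; the function name minimal_cover is hypothetical, and a minimal cover is in general not unique, so a different processing order may give a different but equivalent result.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of lhs, rhs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def minimal_cover(fds):
    # Step 1: put exactly one attribute on every right-hand side.
    g = [(frozenset(lhs), a) for lhs, rhs in fds for a in rhs]
    # Step 2: drop extraneous attributes from left-hand sides.
    for i in range(len(g)):
        lhs, a = g[i]
        for b in sorted(lhs):
            reduced = lhs - {b}
            # b is extraneous if the reduced left side still determines a.
            if reduced and a in closure(reduced, g):
                lhs = reduced
                g[i] = (lhs, a)
    # Step 3: drop FDs that can be inferred from the remaining ones.
    result = list(dict.fromkeys(g))  # remove duplicates, keep order
    for fd in list(result):
        rest = [x for x in result if x != fd]
        if fd[1] in closure(fd[0], rest):
            result = rest
    return result

E = [("B", "A"), ("D", "A"), ("AB", "D")]
mc = minimal_cover(E)
print(sorted("".join(sorted(l)) + "->" + r for l, r in mc))  # ['B->D', 'D->A']
```

For E the sketch first shortens AB → D to B → D and then drops B → A, because B → D and D → A already imply it.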
Normalization of Relations
• Normalization:
▫ The process of decomposing unsatisfactory "bad"
relations by breaking up their attributes into smaller
relations
• Normal form:
▫ Condition using keys and FDs of a relation to certify
whether a relation schema is in a particular normal form
• {Ssn, Pnumber} → Hours
• {Ssn, Pnumber} is a key
• But Ssn alone is not a key
Candidate key
• If a relation schema has more than one key, each is called a
candidate key.
• One of the candidate keys is arbitrarily designated to be the
primary key, and the others are called secondary keys.
• In a practical relational database, each relation schema
must have a primary key.
• If no candidate key is known for a relation, the entire
relation can be treated as a default superkey.
Levels of Normalization
• Levels of normalization are based on the amount of
redundancy in the database.
• Various levels of normalization are:
▫ First Normal Form (1NF)
▫ Second Normal Form (2NF)
▫ Third Normal Form (3NF)
▫ Boyce-Codd Normal Form (BCNF)
▫ Fourth Normal Form (4NF)
▫ Fifth Normal Form (5NF)
▫ Domain Key Normal Form (DKNF)
• Moving down this list, redundancy decreases while the number of
tables and the complexity increase.
Most databases should be 3NF or BCNF in order to avoid the database
anomalies.
(Figure: hierarchy of normal forms from 1NF through 2NF, 3NF, 4NF and 5NF to DKNF.)
• Ssn → Ename
Transitive Dependency
• A functional dependency X → Y in a relation schema R is a
transitive dependency if there exists a set of attributes Z in
R that is neither a candidate key nor a subset of any key of
R, and both X → Z and Z → Y hold.
Summary
• 1NF: Ensure atomicity (every attribute value is atomic)
• 2NF: Must be in 1NF + no partial dependency
A partial dependency has the form
proper subset of a candidate key of R → non-prime attribute
(this includes one prime attribute of a composite key → non-prime attribute)
and is NOT ALLOWED in 2NF
• 3NF: Must be in 2NF + no transitive dependency
A relation R is in 3NF if for every non-trivial FD X → Y
either X is a superkey (SK) or
Y is a prime attribute of R
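The 3NF condition in the summary can be checked mechanically once the candidate keys are known. The sketch below is a small brute-force illustration, assuming single-letter attribute names; the helper names and the first sample schema are hypothetical. It reports every FD X → A where X is not a superkey and A is not a prime attribute.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    schema = set(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            s = set(combo)
            if not any(k <= s for k in keys) and closure(s, fds) == schema:
                keys.append(s)
    return keys

def violations_3nf(schema, fds):
    """Return (lhs, attribute) pairs that break the 3NF condition."""
    schema = set(schema)
    prime = set().union(*candidate_keys(schema, fds))
    bad = []
    for lhs, rhs in fds:
        for a in set(rhs) - set(lhs):  # skip the trivial part of the FD
            if closure(lhs, fds) != schema and a not in prime:
                bad.append((lhs, a))
    return bad

# Hypothetical schema with a transitive dependency A -> B -> C.
print(violations_3nf("ABC", [("A", "B"), ("B", "C")]))              # [('B', 'C')]
# Every attribute is a key here, so there is no violation.
print(violations_3nf("ABC", [("A", "B"), ("B", "C"), ("C", "A")]))  # []
```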
Example 1
• Given R(A,B,C)
• FD={A→B, B→C, C→A}
• R is decomposed into R1(A, B) and R2(B, C). Check whether the
decomposition is a lossless join or not.
ANS. The attribute common to R1 and R2 is B. Now
check whether B is a key of R1 or R2.
B+ = BCA
i.e. B determines every attribute of R1 and of R2 (it is a key of each),
so this is a lossless decomposition.
Example 2
• Given R(A,B,C,D)
• FD={AB→CD, D→A}
• R is decomposed into R1(A, C) and R2(B, C, D). Check whether the
decomposition is a lossless join or not.
ANS. The attribute common to R1 and R2 is C. Now
check whether C is a key of R1 or R2.
C+ = C
i.e. C is not a key of either R1 or R2, so this is a lossy
decomposition.
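The test applied in Examples 1 and 2 can be written as a short function: a decomposition into two relations is lossless if the closure of the common attributes contains all of R1 or all of R2. This is a sketch that assumes single-letter attribute names; the function name lossless_binary is hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def lossless_binary(r1, r2, fds):
    """Binary lossless-join test: (R1 ∩ R2) must determine R1 or R2."""
    common = set(r1) & set(r2)
    if not common:
        return False  # nothing to join on
    c = closure(common, fds)
    return set(r1) <= c or set(r2) <= c

# Example 1: R(A,B,C), R1(AB), R2(BC)
print(lossless_binary("AB", "BC", [("A", "B"), ("B", "C"), ("C", "A")]))  # True
# Example 2: R(A,B,C,D), R1(AC), R2(BCD)
print(lossless_binary("AC", "BCD", [("AB", "CD"), ("D", "A")]))           # False
```

This shortcut applies only to decompositions into exactly two relations; for three or more, the matrix (chase) method is needed.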
Check whether the decomposition is a lossless join or not
• Given R(A,B,C,D,E)
• Decomposed into R1(A,B,C), R2(B,C,D), R3(C,D,E)
• FD={AB→CD, A→E, C→D}
Initial matrix (aj if attribute j belongs to the relation of that row, bij otherwise):
     A    B    C    D    E
R1   a1   a2   a3   b14  b15
R2   b21  a2   a3   a4   b25
R3   b31  b32  a3   a4   a5
After applying C → D (all rows agree on C = a3, so b14 in row R1 becomes a4):
     A    B    C    D    E
R1   a1   a2   a3   a4   b15
R2   b21  a2   a3   a4   b25
R3   b31  b32  a3   a4   a5
• AB → CD and A → E cause no further changes, because no two rows
agree on A.
• Now look for any row with all a values. Here there is no
such row, so we can conclude that this is a lossy join.
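The matrix procedure used in this example can be sketched in Python. This is an illustrative version under stated assumptions: attribute names are single letters, the distinguished symbol aj is represented by the string "a", and the symbol bij by the pair ("b", i). The function name chase_lossless is hypothetical.

```python
def chase_lossless(schema, parts, fds):
    """Matrix (chase) test for the lossless-join property."""
    # Row i gets "a" in column A if A belongs to parts[i],
    # otherwise a row-specific symbol ("b", i).
    rows = [{A: "a" if A in p else ("b", i) for A in schema}
            for i, p in enumerate(parts)]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for r1 in rows:
                for r2 in rows:
                    if r1 is r2 or any(r1[A] != r2[A] for A in lhs):
                        continue  # rows must agree on the whole left side
                    for A in rhs:
                        if r1[A] != r2[A]:
                            # Prefer the distinguished symbol "a";
                            # otherwise keep r1's symbol.
                            new = "a" if "a" in (r1[A], r2[A]) else r1[A]
                            old = r2[A] if r1[A] == new else r1[A]
                            for r in rows:  # rename throughout the column
                                if r[A] == old:
                                    r[A] = new
                            changed = True
    # Lossless if some row consists only of distinguished symbols.
    return any(all(r[A] == "a" for A in schema) for r in rows)

fds = [("AB", "CD"), ("A", "E"), ("C", "D")]
print(chase_lossless("ABCDE", ["ABC", "BCD", "CDE"], fds))  # False (lossy)
print(chase_lossless("ABC", ["AB", "BC"],
                     [("A", "B"), ("B", "C"), ("C", "A")]))  # True (lossless)
```

For the decomposition above, only C → D changes the matrix, and the row for R1 still has a b symbol in column E, so the function returns False.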
Dependency Preservation
• Let Fi be the set of dependencies in F+ that include only
attributes in Ri.
A decomposition is dependency preserving if
(F1 ∪ F2 ∪ … ∪ Fn)+ = F+
If it is not, then checking updates for violation of
functional dependencies may require computing joins,
which is expensive.
• See book for efficient algorithm for checking dependency
preservation
Example
• R(A,B,C,D)
• FD={A→B,B→C, C→D, D→B}
• Decomposed into R1(AB), R2(B,C) and R3(B,D)
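• Find the FDs that hold in each decomposed relation:
▫ R1(A, B): A→B ✔ in original FD.
B→A is not in the original FD, so take B+ to check whether it holds:
B+ = BCD does not contain A, so B→A does not hold.
FD1 = {A→B}
▫ R2(B, C): B→C ✔ in original FD.
C→B is not in the original FD, so take C+ to check whether it holds:
C+ = CDB contains B, so C→B holds.
FD2 = {B→C, C→B}
▫ R3(B, D): D→B ✔ in original FD.
B→D is not in the original FD, so take B+ to check whether it holds:
B+ = BCD contains D, so B→D holds.
FD3 = {B→D, D→B}
• FD1∪FD2∪FD3 = {A→B, B→C, C→B, B→D, D→B}
• Now check that every dependency of the original
FD = {A→B, B→C, C→D, D→B} is preserved in FD1∪FD2∪FD3:
▫ A→B ✔ present in FD1∪FD2∪FD3
▫ B→C ✔ present in FD1∪FD2∪FD3
▫ C→D is not present directly. Take C+ with respect to
FD1∪FD2∪FD3: C+ = CBD. D is covered here ✔
▫ D→B ✔ present in FD1∪FD2∪FD3
• So all FDs are preserved in FD1∪FD2∪FD3, and the decomposition is
dependency preserving.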
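The check can also be done without listing the projected FD sets. The sketch below follows a standard textbook approach: for each X → Y in F, grow a set Z from X by repeatedly adding (Z ∩ Ri)+ ∩ Ri for every Ri, and then test whether Y ⊆ Z. It assumes single-letter attribute names; the function name preserves and the second (failing) example are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def preserves(parts, fds):
    """True if the decomposition `parts` preserves every FD in `fds`."""
    for lhs, rhs in fds:
        z = set(lhs)
        changed = True
        while changed:
            changed = False
            for p in parts:
                p = set(p)
                # Attributes of Ri determined by the part of Z inside Ri.
                gained = (closure(z & p, fds) & p) - z
                if gained:
                    z |= gained
                    changed = True
        if not set(rhs) <= z:
            return False  # this FD cannot be enforced without a join
    return True

F = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "B")]
print(preserves(["AB", "BC", "BD"], F))                   # True
# Hypothetical counter-example: B -> C is lost.
print(preserves(["AB", "AC"], [("A", "B"), ("B", "C")]))  # False
```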
References
• Elmasri R. and S. Navathe, Database Systems: Models,
Languages, Design and Application Programming,
Pearson Education, 2013.
• https://www.geeksforgeeks.org/armstrongs-axioms-in-functional-dependency-in-dbms/