
Normalization 2

The document discusses normalization forms up to 4NF. It summarizes the key properties of 1NF, 2NF, 3NF as requiring attributes to be atomic and dependent on the primary key. Higher normal forms reduce redundancy and anomalies. Multi-value dependencies (MVDs) are introduced where one attribute determines a set of values for another, rather than a single value. A relation is in 4NF if all non-trivial MVDs have a superkey determinant. MVDs allow lossless decomposition into relations where the join of the relations equals the original.

Uploaded by

yashbhardwaj
Copyright
© © All Rights Reserved

Normalization (contd.)

P. M. Jat

Reference Text
Fundamentals of Database Systems, 4th ed.
Ramez Elmasri and Shamkant B. Navathe
Chapter-11
Summary: 1NF, 2NF, 3NF
(based on the primary key)
• 1NF
  - No repeating columns; all attributes are atomic
• 2NF
  - Is in 1NF, and
  - Every non-prime attribute is fully (irreducibly) dependent on the primary key
• 3NF
  - Is in 2NF, and
  - No non-prime attribute is transitively dependent on the primary key

November 4, 2008 Database Systems 2


“Goodness” of a Relation

• The higher the normal form, the better the relation
  - That is, it has fewer redundancies and anomalies, although some may still remain
• In practice, a relation should be at least in third normal form (3NF)


Which normal form is it in?

• Not in 2NF. The relation has two candidate keys, and FD3 is a partial dependency on one of them, which 2NF does not allow


Normalized to 2NF

• Is it in 3NF?
  - No, because FD4 is a transitive dependency


Normalized to 3NF



Is it in BCNF?

• No, because of FD5: its determinant, Area, is not a superkey
• How do we decompose?


Normalized to BCNF

• Is FD2 lost in this decomposition?

BCNF
• A typical schematic diagram of a relation that is in 3NF but not in BCNF

Example-1
• Consider the relation (S#, SName, Status, City), where S# and SName are candidate keys, with the following FDs:
  1. S# -> {Status, City}
  2. SName -> {Status, City}
  3. S# -> SName
  4. SName -> S#
• In which normal form is the relation?
  - 2NF: Yes
  - 3NF: Yes
  - BCNF: Yes

Example-2
• Consider the relation (S#, SName, P#, QTY), where SName is unique. The candidate keys are {S#, P#} and {SName, P#}. The following FDs hold:
  1. {S#, P#} -> QTY
  2. {SName, P#} -> QTY
  3. S# -> SName
  4. SName -> S#
• In which normal form is the relation?
  - 2NF: Yes
  - 3NF: Yes
  - BCNF: No, because of FDs 3 and 4 (their determinants are not superkeys)

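
The BCNF verdict for Example-2 can be checked mechanically. The following is a minimal Python sketch, assuming FDs are stored as (determinant, dependent) pairs of attribute sets; the helper names `closure` and `bcnf_violations` are illustrative and do not come from the slides. It computes the attribute closure of each determinant and reports the FDs whose determinant is not a superkey.

```python
def closure(attrs, fds):
    """Return the closure of a set of attributes under a list of FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """List the non-trivial FDs whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != relation
    ]


# Relation and FDs of Example-2.
R = frozenset({"S#", "SName", "P#", "QTY"})
FDS = [
    (frozenset({"S#", "P#"}), frozenset({"QTY"})),
    (frozenset({"SName", "P#"}), frozenset({"QTY"})),
    (frozenset({"S#"}), frozenset({"SName"})),
    (frozenset({"SName"}), frozenset({"S#"})),
]

violations = bcnf_violations(R, FDS)
for lhs, rhs in violations:
    print(sorted(lhs), "->", sorted(rhs))
```

Only FDs 3 and 4 are reported, because the closure of S# alone (or of SName alone) does not reach P# or QTY.
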
Example-2 contd.
• There are obviously redundancy and anomalies in the relation (S#, SName, P#, QTY). To bring it into BCNF, it can be decomposed into
  1. (S#, SName), where both attributes are candidate keys
  2. (S#, P#, QTY), where {S#, P#} is the candidate key
  OR
  1. (S#, SName), where both attributes are candidate keys
  2. (SName, P#, QTY), where {SName, P#} is the candidate key


Example-3
• Consider a real-world situation in which there are Instructors, Courses, and Students, with the following constraints:
  - Each student is taught a given course by only one instructor.
  - Each instructor teaches only one course, but a course can be taught by multiple instructors.
• What are the FDs?
  1. {Course, Student} -> Instructor
  2. Instructor -> Course
• What are the candidate keys of relation R(Student, Course, Instructor)?
  - {Course, Student}
  - {Student, Instructor} is also a candidate key, since Instructor -> Course


Example-3 contd.
• In which normal form is the relation R(Student, Course, Instructor)?
  - 3NF: Yes
  - BCNF: No, because of FD 2
• Decomposition is not straightforward. Whatever decomposition you make, you lose FD 1
• You cannot bring it into BCNF without losing a dependency.
• The lesson is that you may not always be able to achieve BCNF while preserving all dependencies.
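
The 3NF and BCNF answers for Example-3 can be reproduced with a small brute-force check. This is a sketch under the assumption that the two FDs listed for R(Student, Course, Instructor) are the only ones; `candidate_keys`, `is_bcnf_violation` and `is_3nf_violation` are hypothetical helper names introduced here for illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of a set of attributes under a list of FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(relation, fds):
    """Enumerate minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            attrs = frozenset(combo)
            if any(key <= attrs for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(attrs, fds) == relation:
                keys.append(attrs)
    return keys


R = frozenset({"Student", "Course", "Instructor"})
FDS = [
    (frozenset({"Course", "Student"}), frozenset({"Instructor"})),
    (frozenset({"Instructor"}), frozenset({"Course"})),
]
KEYS = candidate_keys(R, FDS)
PRIME = frozenset().union(*KEYS)  # attributes that belong to some key


def is_bcnf_violation(lhs, rhs):
    """A non-trivial FD violates BCNF when its determinant is not a superkey."""
    return not rhs <= lhs and closure(lhs, FDS) != R


def is_3nf_violation(lhs, rhs):
    """3NF additionally tolerates FDs whose dependents are all prime."""
    return is_bcnf_violation(lhs, rhs) and not (rhs - lhs) <= PRIME


print("candidate keys:", [sorted(k) for k in KEYS])
print("3NF violations:", [fd for fd in FDS if is_3nf_violation(*fd)])
print("BCNF violations:", [fd for fd in FDS if is_bcnf_violation(*fd)])
```

The sketch reports no 3NF violation, because Course is a prime attribute, and reports Instructor -> Course as the only BCNF violation.
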


Properties of Relational Decomposition

• Attribute preservation
• Dependency preservation
• Lossless decomposition


Lossless decomposition
• If a relation R is decomposed into two relations R1 and R2 and their natural join produces exactly the same relation R, the decomposition is called lossless.
• The word "lossless" here refers to loss of information, not loss of tuples.
• A lossy decomposition normally produces extra (spurious) tuples on the natural join; that is why a lossless decomposition is also called a non-additive decomposition.
• The property: a decomposition is lossless if the attributes common to R1 and R2 form a key of at least one of the two relations.
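
The key-based property above can be tested with attribute closures. The sketch below is an illustration only: it reuses the relation and FDs of Example-2 as sample input, and `is_lossless_binary` is a hypothetical helper name. It checks whether the attributes shared by R1 and R2 functionally determine all of R1 or all of R2.

```python
def closure(attrs, fds):
    """Return the closure of a set of attributes under a list of FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """True if the common attributes form a (super)key of R1 or of R2."""
    reachable = closure(r1 & r2, fds)
    return r1 <= reachable or r2 <= reachable


# FDs of Example-2, used here purely as sample input.
FDS = [
    (frozenset({"S#", "P#"}), frozenset({"QTY"})),
    (frozenset({"SName", "P#"}), frozenset({"QTY"})),
    (frozenset({"S#"}), frozenset({"SName"})),
    (frozenset({"SName"}), frozenset({"S#"})),
]

# Common attribute S# is a key of (S#, SName).
good = is_lossless_binary(
    frozenset({"S#", "SName"}), frozenset({"S#", "P#", "QTY"}), FDS
)
# Common attribute P# is a key of neither side.
bad = is_lossless_binary(
    frozenset({"S#", "P#"}), frozenset({"P#", "SName", "QTY"}), FDS
)
print(good, bad)
```

The first split passes the test, while the second does not, since P# on its own determines nothing else.
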


Look at the situation below
Hierarchical Data
Course  Teacher  Text Book
DBMS    PM Jat   Navathe
        Sanjeev  Date
                 Korth
DS      PM Jat   Sahni
                 Aho
Constraints -
• There can be any number of teachers for a course
• There can be any number of texts for a course
• No matter which teacher takes the course, the same set of books is used
• Let us also assume that a text can be used in different courses


Relational representation of the situation
Course          Teacher  Text Book
DBMS            PM Jat   Navathe
DBMS            PM Jat   Date
DBMS            PM Jat   Korth
DBMS            Sanjeev  Navathe
DBMS            Sanjeev  Date
DBMS            Sanjeev  Korth
Data Structure  PM Jat   Sahni
Data Structure  PM Jat   Aho
• In which normal form is the relation?
  - It is an all-key relation, and an all-key relation is necessarily in BCNF
• What if we need to add one more teacher for the course "Data Structure"?

Multi-value dependency
• Note that a course has a set of corresponding teachers
• More precisely, for a given (course, text) pair there is a set of teachers, and it makes no difference which text we choose
• Similarly, for a given (course, teacher) pair there is the same set of texts, no matter which teacher we choose
• This new type of dependency in the relation is a multi-value dependency (MVD)

Multi-value dependencies
• In this relation Teacher is multi-dependent on Course, and Text Book is multi-dependent on Course; therefore we write
  - Course -->> Teacher
  - Course -->> Text Book
• However, as we have seen, MVDs always come in pairs: any pair of A and C values has a set of B values, and that set is the same for every distinct value of C. As you can see, this set is determined by A alone.
• For this reason, it is common to represent both in a single statement:
  - A -->> B | C
  - Course -->> Teacher | Text Book


MVD
• Formally, an MVD A -->> B | C can be defined like this:
  - Whenever two tuples exist that have distinct values of B but the same value of A, these values of B must be repeated in separate tuples for every distinct value of C that occurs with the same value of A
• For example
  - If tuples (a1, b1, c1) and (a1, b2, c2) both exist
  - Then tuples (a1, b1, c2) and (a1, b2, c1) must also exist
• An MVD A -->> B is called trivial when
  - A U B = R (that is, there is no C in A -->> B | C), or
  - B is a subset of A

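
The formal definition can be turned into a check on a relation instance. The following is a minimal Python sketch, assuming the relation is held as a set of (a, b, c) triples; `holds_mvd` is an illustrative name, and the data is the Course/Teacher/Text Book relation from the earlier slide.

```python
from collections import defaultdict


def holds_mvd(rows):
    """Check A -->> B | C on a set of (a, b, c) triples."""
    b_values = defaultdict(set)
    c_values = defaultdict(set)
    for a, b, c in rows:
        b_values[a].add(b)
        c_values[a].add(c)
    # For each a, every combination of its b values and c values must occur.
    return all(
        (a, b, c) in rows
        for a in b_values
        for b in b_values[a]
        for c in c_values[a]
    )


# (Course, Teacher, Text Book) tuples from the relational representation.
CTB = {
    ("DBMS", "PM Jat", "Navathe"),
    ("DBMS", "PM Jat", "Date"),
    ("DBMS", "PM Jat", "Korth"),
    ("DBMS", "Sanjeev", "Navathe"),
    ("DBMS", "Sanjeev", "Date"),
    ("DBMS", "Sanjeev", "Korth"),
    ("Data Structure", "PM Jat", "Sahni"),
    ("Data Structure", "PM Jat", "Aho"),
}

print(holds_mvd(CTB))

# Dropping a single tuple breaks Course -->> Teacher | Text Book.
incomplete = CTB - {("DBMS", "Sanjeev", "Korth")}
print(holds_mvd(incomplete))
```

The second call fails because Sanjeev still teaches DBMS and Korth is still a DBMS text, so the missing combination is required by the MVD.
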
MVD and Decomposition
• A relation R = {A, B, C} has a lossless decomposition into R1 = {A, B} and R2 = {A, C} if and only if A -->> B | C

R1                   R2
Course  Teacher      Course  Text Book
DBMS    PM Jat       DBMS    Navathe
DBMS    Sanjeev      DBMS    Date
DS      PM Jat       DBMS    Korth
                     DS      Sahni
                     DS      Aho
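
To see the decomposition at work, the sketch below projects the Course/Teacher/Text Book relation onto the two schemas and joins the projections back. It is an illustration under the assumption that relations are plain Python sets of tuples; `project` and `join_on_first` are hypothetical helper names.

```python
def project(rows, *positions):
    """Project a set of tuples onto the given column positions."""
    return {tuple(row[i] for i in positions) for row in rows}


def join_on_first(r1, r2):
    """Natural join of (a, b) pairs with (a, c) pairs on the shared a."""
    return {(a, b, c) for a, b in r1 for a2, c in r2 if a == a2}


# (Course, Teacher, Text Book) tuples; DS abbreviates Data Structure.
CTB = {
    ("DBMS", "PM Jat", "Navathe"),
    ("DBMS", "PM Jat", "Date"),
    ("DBMS", "PM Jat", "Korth"),
    ("DBMS", "Sanjeev", "Navathe"),
    ("DBMS", "Sanjeev", "Date"),
    ("DBMS", "Sanjeev", "Korth"),
    ("DS", "PM Jat", "Sahni"),
    ("DS", "PM Jat", "Aho"),
}

R1 = project(CTB, 0, 1)  # (Course, Teacher)
R2 = project(CTB, 0, 2)  # (Course, Text Book)
rejoined = join_on_first(R1, R2)
print(len(R1), len(R2), rejoined == CTB)

# Without the MVD, the same split produces a spurious tuple on rejoin.
broken = CTB - {("DBMS", "Sanjeev", "Korth")}
spurious = join_on_first(project(broken, 0, 1), project(broken, 0, 2)) - broken
print(spurious)
```

Eight tuples shrink to three plus five, and the join restores the original exactly; for the instance that violates the MVD, the join adds back the tuple that was removed.
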


4NF
• 4NF is based on MVDs
  - A relation R is in 4NF if and only if, for every non-trivial MVD A -->> B that holds over R, A is a superkey
• In other words, every non-trivial MVD or FD that holds in a 4NF relation is determined by a key
• In our CTB example, we can have
  - R1(Course, Teacher) and R2(Course, Text Book)
  - Both are in 4NF because the MVDs in both relations are now trivial


MVD

• Whenever we decompose a relation R having an MVD A -->> B into R1 = A U B and R2 = R - B, the decomposition is lossless


FD is a special case of MVD
• As the condition for an MVD, we have seen that every pair of A and C values is associated with the same set of B values
• If that set always holds a single B value, the dependency is an FD; otherwise it is an MVD
• An MVD is present when
  - one value of an attribute determines multiple values for another set (or sets) of attributes


Join Dependencies
• If a relation R has a lossless decomposition into R1 and R2, then R is said to have a join dependency, represented as
  *{R1, R2}
• For example, the relation EMP_PROJ(SSN, ENAME, PNO, HOURS) can be losslessly decomposed into EMP(SSN, ENAME) and PROJ(SSN, PNO, HOURS), because it has the join dependency
  *{EMP(SSN, ENAME), PROJ(SSN, PNO, HOURS)}
• A relation has a lossless decomposition only if it has the corresponding join dependency


Join Dependencies
• In general,
  - if a relation R is decomposed into relations R1, R2, … Rn, and
  - the join of R1, R2, … Rn is equal to R,
  then we say that the following JD holds:
  *{R1, R2, … Rn}
• And then R has a lossless decomposition into R1, R2, … Rn.


JD, MVD, FD
• An MVD is a special case of a JD:
  - a relation R = {A, B, C} has the JD *{AB, AC} exactly when A -->> B | C holds; that is,
  - A -->> B | C implies the JD *{AB, AC}
• An FD is a special case of an MVD: as we have seen, if A -->> B never yields multiple values of B, it is an FD


5NF
• A relation R is in 5NF, also called Projection-Join Normal Form (PJ/NF),
  - when, for every non-trivial join dependency JD(R1, R2, … Rn) that holds on R, every Ri is a superkey of R
• Suppose the relation
  - EMP(SSN, ENAME, BDATE, SUPERSSN, SALARY)
  has the join dependency
  *{EMP1(SSN, ENAME, BDATE), EMP2(SSN, SUPERSSN, SALARY)}
  - Every component relation is a superkey of EMP
Note: a JD is trivial if one of the relations Ri in JD(R1, R2, … Rn) is equal to R

A peculiar JD
• 3-decomposable relation SPJ
S#  P#  J#
S1  P1  J2
S1  P2  J1
S2  P1  J1
S1  P1  J1

Projections SP, PJ and SJ:
SP          PJ          SJ
S#  P#      P#  J#      S#  J#
S1  P1      P1  J2      S1  J2
S1  P2      P2  J1      S1  J1
S2  P1      P1  J1      S2  J1


• SP and PJ are joined over P#
S#  P#  J#
S1  P1  J2
S1  P2  J1
S2  P1  J1
S2  P1  J2
S1  P1  J1
• Then SJ is joined over (J#, S#)
S#  P#  J#
S1  P1  J2
S1  P2  J1
S2  P1  J1
S1  P1  J1
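
The two join steps can be replayed on the SPJ instance. The sketch below is illustrative only and assumes relations are Python sets of tuples; the variable names `sp_pj` and `all_three` are introduced here. It joins SP with PJ over P#, then filters the result against SJ, which is equivalent to the final join over (S#, J#).

```python
# The SPJ instance from the slides, as (S#, P#, J#) tuples.
SPJ = {
    ("S1", "P1", "J2"),
    ("S1", "P2", "J1"),
    ("S2", "P1", "J1"),
    ("S1", "P1", "J1"),
}

# The three binary projections.
SP = {(s, p) for s, p, j in SPJ}
PJ = {(p, j) for s, p, j in SPJ}
SJ = {(s, j) for s, p, j in SPJ}

# Step 1: join SP and PJ over P#.
sp_pj = {(s, p, j) for s, p in SP for p2, j in PJ if p == p2}
spurious = sp_pj - SPJ

# Step 2: join the result with SJ over (S#, J#).
all_three = {t for t in sp_pj if (t[0], t[2]) in SJ}

print(sorted(spurious))
print(all_three == SPJ)
```

The first join contains one tuple that is not in SPJ, and the join with SJ removes it again, so only the three-way join reproduces the original relation.
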


• This is not an MVD because
  - there is no fixed set of parts that a supplier supplies to every project
  - in our CTB example, by contrast, a teacher who takes a course must use all of its text books
• Here any combination can occur; however, there is a special kind of constraint: if
  - supplier S1 supplies part P1,
  - part P1 is used in project J1, and
  - S1 supplies to project J1,
• then it is also true that S1 supplies P1 to project J1


What is the problem in relation SPJ?

• If tuples (s1, p1, j2), (s2, p1, j1) and (s1, p2, j1) appear in SPJ, then tuple (s1, p1, j1) must also exist
• Enforcing this kind of constraint is difficult, and so is detecting JDs


Thanks
