Dbms Chapter 4
Unit-4
Constraints and Normalization
Student
RollNo Name SPI BL
101 Raju 8 0
102 ram 7 1
103 Jay 7 0
Example
Consider the relation Account(account_no, balance, branch).
account_no can determine balance and branch.
So, there is a functional dependency from account_no to balance and branch.
This can be denoted by account_no → {balance, branch}.
Sub_Fac
Subject Faculty Age
DS Shah 35
DBMS singh 32
DF Shah 35
FD: x → y
If tuple1.x = tuple2.x then tuple1.y = tuple2.y
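The tuple-based definition above can be checked mechanically against a table instance. The following is a minimal sketch, assuming rows are stored as Python dictionaries; the helper name `holds` is hypothetical. It uses the Sub_Fac rows shown above. Note that such a check only shows that the data does not violate the FD; it cannot prove the FD is a rule of the schema.

```python
def holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs is not violated by these rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(x, y) != y:
            return False
    return True


sub_fac = [
    {"Subject": "DS", "Faculty": "Shah", "Age": 35},
    {"Subject": "DBMS", "Faculty": "singh", "Age": 32},
    {"Subject": "DF", "Faculty": "Shah", "Age": 35},
]

faculty_to_age = holds(sub_fac, ["Faculty"], ["Age"])
faculty_to_subject = holds(sub_fac, ["Faculty"], ["Subject"])
print(faculty_to_age)      # True: both Shah rows have Age 35
print(faculty_to_subject)  # False: Shah appears with DS and with DF
```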
Transitivity: if X → Y and Y → Z then X → Z. Example: name → marks, marks → dept, then name → dept
Augmentation: if X → Y then XA → YA. Example: roll_no → marks, then roll_no, name → marks, name
Union: if X → Y and X → Z then X → YZ. Example: roll_no → marks, roll_no → name, then roll_no → marks, name
Decomposition/splitting: if X → YZ then X → Y and X → Z. Example: name → (marks, dept), then name → dept, name → marks
Reflexivity: if B is a subset of A, then A → B
Augmentation: if A → B, then AC → BC
Self-determination: A → A
Transitivity: if A → B and B → C, then A → C
Pseudo-transitivity: if A → B and BD → C, then AD → C
Decomposition: if A → BC, then A → B and A → C
Union: if A → B and A → C, then A → BC
Composition: if A → B and C → D, then AC → BD
Attribute Closure/Closure set
The attribute closure of a set of attributes is the set of all attributes that can be derived from it.
R(A, B, C, D, E)
FD: {A → B, B → C, C → D, D → E}
A+ = {A, B, C, D, E}
B+ = {B, C, D, E}
C+ = {C, D, E}
D+ = {D, E}
E+ = {E}
AD+ = {A, D, B, C, E}
ACD+ = {A, C, D, B, E}
CD+ = {C, D, E}
A, AD, ACD are super keys.
B, BD, CD are not super keys.
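The closure computation above can be sketched as a fixed-point loop. This is a minimal illustration, assuming single-letter attribute names and FDs written as (left side, right side) string pairs, with the FD set of the example read as A → B, B → C, C → D, D → E; the function name `closure` is hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


fds = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
for x in ["A", "B", "AD", "ACD", "CD"]:
    print(x + "+ =", sorted(closure(x, fds)))
```

A set X is a super key of R exactly when `closure(X, fds)` equals the full attribute set of R.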
Super key: a set of attributes whose attribute closure contains all the attributes of the given relation.
Candidate key: a super key none of whose proper subsets is a super key.
A has no non-empty proper subset, so it is a candidate key.
AD: its proper subsets are A and D. Here A is a super key, so AD is not a candidate key.
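The two definitions above suggest a brute-force search: try attribute sets from smallest to largest and keep every super key that does not contain an already-found key. The sketch below assumes single-letter attribute names and the chain of FDs A → B, B → C, C → D, D → E on R(A, B, C, D, E); `closure` and `candidate_keys` are hypothetical helper names, and the search is exponential, so it is only suitable for small textbook relations.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal super keys, smallest first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            s = set(combo)
            if any(k <= s for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(s, fds) == set(attributes):
                keys.append(s)
    return keys


fds = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
keys = candidate_keys("ABCDE", fds)
print(keys)  # [{'A'}]
```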
R(A, B, C, D, E)
FD: {A → B, D → E}
ADE+ = {A, D, E, B}
ABCDE+ = {A, C, D, E, B}
ACDE+ = {A, C, D, B, E}
ACD+ = {A, C, D, B, E}
Now check its proper subsets:
AC, AD, CD, A, C, D
None of these is a super key, so ACD is a candidate key.
Prime attributes: those attributes which are part of a candidate key
Prime attributes: A, C, D
Super keys (SK):
ABCD+ = {A, B, C, D}
ACD+ = {A, D, B, C}
AD+ = {A, D, B, C}
A+ = {A, B, C}
D+ = {D}
AD is a candidate key
Prime attributes: A, D
CD
C+ = {C, A, B}
CD is a candidate key
Prime attributes: A, C, D
BD
B+ = {B, C, A}
BD is a candidate key
Prime attributes: A, B, C, D
Here, AD, BD, CD are candidate keys and A, B, C, D are prime attributes.
R(A, B, C, D)
FD: {AB → CD, D → B, C → A}
ABCDEF+ = {A, B, C, D, E, F}, so ABCDEF is a super key (SK)
AB
A+ = {A}
B+ = {B}
We don't need to go further
AB is a candidate key
Prime attributes are A, B
DB
D+ = {D, A}
DB is a candidate key
Prime attributes are A, B, D
AC
C+ = {C, D, E, F, A, B}
C+ contains all the attributes, so C alone is a super key; therefore AC is not a candidate key
CB
C+ = {C, D, E, F, A, B}
CB is not a candidate key
Candidate keys are AB, DB, C
Prime attributes are A, B, C, D
Non-prime attributes are E and F
R(ABCDEFGH)
Customer
Ano Balance Bname
A01 5000 kat
A01 5000 pok
A02 5000 kat
A02 5000 pok
Lossless decomposition
The decomposition of relation R into R1 and R2 is lossless when the join of R1 and R2 produces the same relation as R.
This is also referred to as a non-additive (non-loss) decomposition.
All decompositions must be lossless.
Customer
Ano Balance Bname
A01 5000 kat
A02 5000 pok
Table-1
Ano Balance
A01 5000
A02 5000
Table-2
Ano Bname
A01 kat
A02 pok
Joining Table-1 and Table-2 on Ano gives the same Customer relation:
Customer
Ano Balance Bname
A01 5000 kat
A02 5000 pok
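The lossless property can be tested on an instance by projecting, joining, and comparing with the original. The following is a minimal sketch, assuming relations are lists of dictionaries; `project`, `natural_join` and `same_relation` are hypothetical helper names, not a library API. It also shows a split that shares only Balance, which produces extra (spurious) tuples when joined.

```python
def project(rows, attrs):
    """Projection with duplicate elimination, keeping first-seen order."""
    out = []
    for row in rows:
        p = {a: row[a] for a in attrs}
        if p not in out:
            out.append(p)
    return out


def natural_join(r1, r2):
    """Join two non-empty lists of dict rows on their common attributes."""
    common = set(r1[0]) & set(r2[0])
    out = []
    for a in r1:
        for b in r2:
            if all(a[c] == b[c] for c in common):
                merged = {**a, **b}
                if merged not in out:
                    out.append(merged)
    return out


def same_relation(r1, r2):
    """Compare two relations as sets of tuples, ignoring row order."""
    key = lambda row: sorted(row.items())
    return sorted(r1, key=key) == sorted(r2, key=key)


customer = [
    {"Ano": "A01", "Balance": 5000, "Bname": "kat"},
    {"Ano": "A02", "Balance": 5000, "Bname": "pok"},
]

# Split on Ano, as in Table-1 and Table-2.
t1 = project(customer, ["Ano", "Balance"])
t2 = project(customer, ["Ano", "Bname"])
lossless = same_relation(natural_join(t1, t2), customer)

# Split on Balance instead: both rows share 5000, so the join adds tuples.
u1 = project(customer, ["Ano", "Balance"])
u2 = project(customer, ["Balance", "Bname"])
lossy_join = natural_join(u1, u2)

print(lossless)         # True
print(len(lossy_join))  # 4
```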
Dependency Preserving Decomposition
FD:- A B C
FD:- A B FD:-
FD:- A B
FD:- A B C
FD:- A B FD:- A C
FD:- A B, A C
A → BC, B → CA, C → AB
FD: C → D, D → C
A+ = {A, B, C, D}
B+ = {B, C, D, A}
C+ = {C, D, A, B}
FD1 union FD2
R1 ( A, B)
R2 ( B, C)
Insert anomaly
An insert anomaly occurs when certain attributes cannot be inserted into the database without the presence of another attribute.
Emp_Dept
EID Ename City DID Dname Manager
1 Raj kat 1 CE Shah
2 ravi pok 1 CE Shah
NULL NULL NULL 2 IT NULL
Want to insert new department detail (IT)
Suppose a new department (IT) has been started by the organization, but initially no employee has been appointed to that department.
We want to insert that department's details into the Emp_Dept table.
But the tuple for this department cannot be inserted into the table, because EID would be NULL, which is not allowed since EID is the primary key.
This kind of problem, where some tuple cannot be inserted into a relation, is known as an insert anomaly.
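The rejected insert can be reproduced with Python's built-in `sqlite3` module. This is a minimal sketch under stated assumptions: the column types are guesses, and EID is declared `INT NOT NULL PRIMARY KEY` because SQLite, unlike most systems, would otherwise accept NULL in a non-INTEGER primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Emp_Dept (
           EID INT NOT NULL PRIMARY KEY,  -- assumed type; NOT NULL made explicit
           Ename TEXT, City TEXT,
           DID INT, Dname TEXT, Manager TEXT)"""
)
conn.executemany(
    "INSERT INTO Emp_Dept VALUES (?, ?, ?, ?, ?, ?)",
    [(1, "Raj", "kat", 1, "CE", "Shah"), (2, "ravi", "pok", 1, "CE", "Shah")],
)

try:
    # The new IT department has no employee yet, so EID would have to be NULL.
    conn.execute(
        "INSERT INTO Emp_Dept VALUES (?, ?, ?, ?, ?, ?)",
        (None, None, None, 2, "IT", None),
    )
    inserted = True
except sqlite3.IntegrityError as exc:
    inserted = False
    print("insert rejected:", exc)

row_count = conn.execute("SELECT COUNT(*) FROM Emp_Dept").fetchone()[0]
```

Storing departments in a table of their own, keyed by DID, would let the IT row be inserted without any employee.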
Delete anomaly
Consider a relation Emp_Dept(EID, Ename, City, DID, Dname, Manager) with EID as the primary key.
Emp_Dept
Now suppose there is only one employee in some department (IT) and that employee leaves the organization.
So we need to delete the tuple of that employee (Jay).
But in doing so, the information about the department is also deleted.
This kind of problem, where deleting some tuples leads to the loss of other data that was not intended to be removed, is known as a delete anomaly.
Update anomaly
Consider a relation Emp_Dept(EID, Ename, City, Dname, Manager) with EID as the primary key.
Emp_Dept
What do we do in normalization?
Normalization generally involves splitting an existing table into multiple (more than one) tables, which can be rejoined or linked each time a query is issued (executed).
How many normal forms are there?
Normal forms:
1NF (First normal form)
2NF (Second normal form)
3NF (Third normal form)
BCNF (Boyce–Codd normal form)
4NF (Fourth normal form)
A relation R is in first normal form (1NF) if and only if it does not contain any composite
attribute or multi-valued attributes or their combinations.
OR
A relation R is in first normal form (1NF) if and only if all underlying domains contain
atomic values only.
1NF (First Normal Form) [Example - Composite attribute]
Customer
CID Name Address
C01 Raju kalimati Road, kat
C02 ram kec galli, kalimati
C03 Jay prithivi chwok, pok
• In the customer relation, address is a composite attribute which is further divided into the sub-attributes “Road” and “City”.
• So the customer relation is not in 1NF.
Problem: It is difficult to retrieve the list of customers living in the city 'kalimati' from the customer table.
The reason is that the address attribute is a composite attribute, which holds the road name as well as the city name in a single cell.
The city name may also occur as a word in a road name.
In our example, the word 'kalimati' occurs in two records: in the first record it is part of the road name, and in the second it is the name of the city.
1NF (First Normal Form) [Example - Composite attribute]
Customer (not in 1NF)
CID Name Address
C01 Raju kalimati Road, kat
C02 ram kec galli, kalimati
C03 Jay prithivi chwok, pok
Customer (in 1NF)
CID Name Road City
C01 Raju kalimati Road kat
C02 ram kec galli kalimati
C03 Jay prithivi chwok pok
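The retrieval problem with the composite Address attribute, and how the split into Road and City removes it, can be illustrated with a few lines of Python. This is a sketch only: it assumes every address has the form "road, city" with exactly one comma, which holds for the three example rows but not for addresses in general.

```python
before = [
    {"CID": "C01", "Name": "Raju", "Address": "kalimati Road, kat"},
    {"CID": "C02", "Name": "ram", "Address": "kec galli, kalimati"},
    {"CID": "C03", "Name": "Jay", "Address": "prithivi chwok, pok"},
]

# Substring search on the composite attribute also matches C01's road name.
by_substring = [r["CID"] for r in before if "kalimati" in r["Address"]]

# Split Address into atomic Road and City attributes (1NF).
after = []
for r in before:
    road, city = (part.strip() for part in r["Address"].split(","))
    after.append({"CID": r["CID"], "Name": r["Name"], "Road": road, "City": city})

by_city = [r["CID"] for r in after if r["City"] == "kalimati"]

print(by_substring)  # ['C01', 'C02']
print(by_city)       # ['C02']
```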
Person
PID Full_Name City
P01 Raju Maheshbhai singh kat
1NF (First Normal Form) [Example - Multivalued attribute]
Student
Rno Name FailedinSubjects
101 Raju DS, DBMS
102 ram DBMS, DS
103 Jay DS, DBMS, DE
104 laxman DBMS, DE, DS
• In the student relation, the FailedinSubjects attribute is a multi-valued attribute which can store more than one value.
• So the above relation is not in 1NF.
Problem: It is difficult to retrieve from the student table the list of students who failed in 'DBMS' as well as 'DS' but not in any other subject.
The reason is that the FailedinSubjects attribute is a multi-valued attribute, so it contains more than one value.
1NF (First Normal Form) [Example - Multivalued attribute]
Student (not in 1NF)
Rno Name FailedinSubjects
101 Raju DS, DBMS
102 ram DBMS, DS
103 Jay DS, DBMS, DE
104 laxman DBMS, DE, DS
105 ravi DE, DBMS, DS
106 gaurav DE, DBMS
Student
Rno Name
101 Raju
102 ram
103 Jay
104 laxman
105 ravi
106 gaurav
Result
RID Rno Subject
1 101 DS
2 101 DBMS
3 102 DBMS
4 102 DS
5 103 DS
… … …
Solution: Split the table into two tables in such a way that
the first table contains all attributes except the multi-valued attribute, with the same primary key, and
the second table contains the multi-valued attribute, with a primary key of its own.
Insert the primary key of the first table into the second table as a foreign key.
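The split described above can be sketched in Python using the first three Student rows. The sketch assumes the subjects in FailedinSubjects are separated by commas and that RID is a simple running number; both are illustrative choices, not something the slides prescribe.

```python
student = [
    (101, "Raju", "DS, DBMS"),
    (102, "ram", "DBMS, DS"),
    (103, "Jay", "DS, DBMS, DE"),
]

# First table: everything except the multi-valued attribute.
student_1nf = [(rno, name) for rno, name, _ in student]

# Second table: one row per (student, subject), with Rno as a foreign key.
result = []
for rno, _, failed in student:
    for subject in failed.split(","):
        rid = len(result) + 1  # surrogate primary key for Result
        result.append((rid, rno, subject.strip()))

# Students who failed in exactly DS and DBMS and in no other subject.
failed_by_rno = {}
for _, rno, subject in result:
    failed_by_rno.setdefault(rno, set()).add(subject)
only_ds_dbms = [rno for rno, subjects in failed_by_rno.items()
                if subjects == {"DS", "DBMS"}]

print(result[:5])
print(only_ds_dbms)  # [101, 102]
```

With one subject per row, the query that was hard on the original table becomes a plain set comparison.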
2NF (Second Normal Form)
Conditions for 2NF
Problem: For example, in the case of a joint account, multiple (more than one) customers share a common (single) account.
If an account 'A01' is operated jointly by two customers, say 'C01' and 'C02', then the data values for the attributes Balance and BranchName will be duplicated in the two different tuples of customers 'C01' and 'C02'.
2NF (Second Normal Form) [Example]
Customer
CID ANO AccessDate Balance BranchName
C01 A01 01-01-2017 50000 kat
C02 A01 01-03-2017 50000 kat
Table-1
ANO Balance BranchName
A01 50000 kat
A02 25000 pok
Table-2
CID ANO AccessDate
C01 A01 01-01-2017
C02 A01 01-03-2017
Solution: Decompose the relation in such a way that the resultant relations do not have any partial FD.
Remove the partially dependent attributes from the relation that violates 2NF.
Place them in a separate relation along with the prime attribute on which they are fully dependent.
The primary key of the new relation will be the attribute on which they are fully dependent.
Keep the other attributes the same as in that table, with the same primary key.
3NF (Third Normal Form)
Conditions for 3NF
Problem: In this relation, the branch address is stored repeatedly for each account of the same branch, which occupies more space.
3NF (Third Normal Form) [Example]
Customer
ANO Balance BranchName BranchAddress
A01 50000 kat Kalmati
A02 40000 kat Kalmati
Table-1
BranchName BranchAddress
kat Kalmati
pok prithivi chwok
Table-2
ANO Balance BranchName
A01 50000 kat
A02 40000 kat
Solution: Decompose the relation in such a way that the resultant relations do not have any transitive FD.
Remove the transitively dependent attributes from the relation that violates 3NF.
Place them in a new relation along with the non-prime attributes due to which the transitive dependency occurred.
The primary key of the new relation will be the non-prime attributes due to which the transitive dependency occurred.
Keep the other attributes the same as in the table, with the same primary key, and add the prime attributes of the other relation into it as a foreign key.
BCNF (Boyce-Codd Normal Form)
Conditions for BCNF
BCNF is based on the concept of a determinant.
In AccountNO → {Balance, Branch}, AccountNO (the primary key) is the determinant and {Balance, Branch} is the dependent.
Problem: In this relation, one student can learn more than one subject with different faculty, so records are stored repeatedly for each student, subject and faculty combination, which occupies more space.
• Here, one faculty teaches only one subject, but a subject may be taught by more than one faculty.
• A student can learn a subject from only one faculty.
BCNF (Boyce-Codd Normal Form) [Example]
Student
RNO Subject Faculty
101 DS singh
102 DBMS Shah
103 DS kc
104 DBMS KC
105 DBMS Shah
102 DS singh
101 DBMS KC
105 DS kc
Table-1
Faculty Subject
singh DS
Shah DBMS
kc DS
KC DBMS
Table-2
RNO Faculty
101 singh
102 Shah
103 kc
104 KC
105 Shah
102 singh
101 KC
105 kc
• Solution: Decompose the relation in such a way that the resultant relations do not have any transitive FD.
• Remove the transitively dependent prime attribute from the relation that violates BCNF.
• Place it in a separate new relation along with the non-prime attribute due to which the transitive dependency occurred.
• The primary key of the new relation will be this non-prime attribute due to which the transitive dependency occurred.
• Keep the other attributes the same as in that table, with the same primary key, and add a prime attribute of the other relation into it as a foreign key.
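A BCNF check can be sketched by testing whether every determinant is a super key. The sketch below assumes the two stated rules translate into the FDs {RNO, Subject} → Faculty and Faculty → Subject; `closure` and `bcnf_violations` are hypothetical helper names.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) tuples."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the nontrivial FDs whose determinant is not a super key."""
    all_attrs = set(attributes)
    bad = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        if not trivial and closure(lhs, fds) != all_attrs:
            bad.append((lhs, rhs))
    return bad


attributes = ["RNO", "Subject", "Faculty"]
fds = [
    (("RNO", "Subject"), ("Faculty",)),  # a student learns a subject from one faculty
    (("Faculty",), ("Subject",)),        # a faculty teaches only one subject
]

student_violations = bcnf_violations(attributes, fds)
table1_violations = bcnf_violations(["Faculty", "Subject"], [(("Faculty",), ("Subject",))])
table2_violations = bcnf_violations(["RNO", "Faculty"], [])

print(student_violations)  # [(('Faculty',), ('Subject',))]
print(table1_violations)   # []
print(table2_violations)   # []
```

Faculty determines Subject but is not a super key of Student, which is why the relation is split into Table-1(Faculty, Subject) and Table-2(RNO, Faculty).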
Multivalued dependency (MVD)
For a dependency X → Y, if multiple values of Y exist for a single value of X, then the table may have a multi-valued dependency.
Student
RNO Subject Faculty
101 DS singh
101 DS Shah
The above student table has a multivalued dependency, so it is not in 4NF.
Functional dependency & Multivalued dependency
A table can have both functional and multi-valued dependencies together.
RNO → Address
RNO →→ Subject
RNO →→ Faculty
Let R be a relation with attributes ABCD and FDs B → C, D → A. Find the keys for relation R.
The core is BD. B determines C and D determines A, so BD determines every attribute. Therefore BD is the key.
Let R be a relation with attributes ABCD and FDs A → B, BC → D and A → C. Find the keys for relation R.
The core is A. A determines B and C, which together determine D, so A determines every attribute. Therefore A is the key.
Find (candidate) key & check for normal forms [Example]
Suppose you are given a relation R with four attributes ABCD. For each of the following sets of FDs,
do the following: F = (B → C, D → A)
Identify the best normal form that R satisfies (1NF, 2NF, 3NF or BCNF).
Candidate Key is BD
Relation R is in 1NF but not in 2NF, because the above FDs contain partial dependencies.
(As per FD B → C, C depends only on B, but the key is BD, so C is partially dependent on the key (BD).)
(As per FD D → A, A depends only on D, but the key is BD, so A is partially dependent on the key (BD).)
Find (candidate) key & check for normal forms [Example]
Suppose you are given a relation R with four attributes ABCD. For each of the following sets of FDs,
do the following: F = (C → D, C → A, B → C)
Identify the best normal form that R satisfies (1NF, 2NF, 3NF or BCNF).
Candidate Key is B
Relation R is in 2NF but not in 3NF, because the above FDs contain transitive dependencies.
(As per FDs B → C and C → D, B → D, so D is transitively dependent on the key (B).)
(As per FDs B → C and C → A, B → A, so A is transitively dependent on the key (B).)
Find (candidate) key & check for normal forms [Example]
Suppose you are given a relation R with four attributes ABCD. For each of the following sets of FDs,
do the following: F = (A → B, BC → D, A → C)
Identify the best normal form that R satisfies (1NF, 2NF, 3NF or BCNF).
Candidate Key is A
Relation R is in 2NF but not in 3NF, because the above FDs contain a transitive dependency.
(As per FDs A → B and A → C, A → BC using the union rule.)
(As per FDs A → BC and BC → D, A → D, so D is transitively dependent on the key (A).)
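The three worked examples above can be cross-checked with a small classifier. This is a sketch under stated assumptions: attribute names are single letters, 1NF is taken for granted, and the helper names `closure`, `candidate_keys` and `highest_normal_form` are hypothetical. It uses brute-force key search, so it is only meant for small textbook relations.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal super keys, smallest first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            s = set(combo)
            if not any(k <= s for k in keys) and closure(s, fds) == set(attributes):
                keys.append(s)
    return keys


def highest_normal_form(attributes, fds):
    keys = candidate_keys(attributes, fds)
    prime = set().union(*keys)
    all_attrs = set(attributes)

    # 2NF: no non-prime attribute may depend on a proper subset of a key.
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                if (closure(part, fds) - set(part)) - prime:
                    return "1NF"

    # 3NF: for every nontrivial X -> A, X is a super key or A is prime.
    # BCNF: for every nontrivial X -> A, X is a super key.
    form = "BCNF"
    for lhs, rhs in fds:
        if closure(lhs, fds) == all_attrs:
            continue
        for a in set(rhs) - set(lhs):
            if a not in prime:
                return "2NF"
            form = "3NF"
    return form


print(highest_normal_form("ABCD", [("B", "C"), ("D", "A")]))               # 1NF
print(highest_normal_form("ABCD", [("C", "D"), ("C", "A"), ("B", "C")]))   # 2NF
print(highest_normal_form("ABCD", [("A", "B"), ("BC", "D"), ("A", "C")]))  # 2NF
```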
Find (candidate) key & check for normal forms [Example]
Suppose you are given a relation R with four attributes ABCD. For each of the following sets of FDs,
do the following: F = (ABC → D, D → A)
Identify the best normal form that R satisfies (1NF, 2NF, 3NF or BCNF).
UNF
Employee Number  Employee Name  Date of Birth  Department Code  Department Name  Project Code  Project Description  Project Supervisor
1  Raj  1-1-85  1  CE  1  IOT  singh