Chap 4 Database Constraints and Normalization
000-124-456        The Old Man and The Sea      Rs. 97   Ernest Hemingway
978-81-291-0818-0  One Night @ The Call Center  Rs. 200  Chetan Bhagat
Foreign Key
➢A key is an attribute, or a set of attributes, that is used to identify an
entity uniquely within its entity set.
Here the Roll_no attribute uniquely identifies the Name attribute of the Student
table: if we know the Roll_no, we can tell the student name associated with it.
Functional dependency can be written as:
Roll_no → Name
We can say that Name is functionally dependent on Roll_no.
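The dependency Roll_no → Name can be tested against actual rows: it is violated as soon as two rows agree on Roll_no but differ on Name. Below is a minimal Python sketch of that test. The function name fd_holds and the sample Student rows are made up for illustration and are not part of the original example.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data for a Student table
students = [
    {"Roll_no": 1, "Name": "Asha"},
    {"Roll_no": 2, "Name": "Bikash"},
    {"Roll_no": 3, "Name": "Asha"},
]

print(fd_holds(students, ["Roll_no"], ["Name"]))  # True
print(fd_holds(students, ["Name"], ["Roll_no"]))  # False: "Asha" maps to 1 and 3
```

Note that a set of rows can only disprove a dependency. Whether Roll_no → Name holds in general is a rule about the data being modelled, not something one sample table can prove.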
• Fully-Functional Dependency
• Partial Dependency
• Transitive Dependency
• Multivalued Dependency
• Trivial Functional Dependency
• Non-trivial Functional Dependency
ProjectID → ProjectCost
Here ProjectCost is fully functionally dependent on ProjectID
{Company} → {CEO} (if we know the Company, we know the CEO's name)
But CEO is not a subset of Company, and hence it is a non-trivial functional
dependency.
In this example, ProjectNo and Hobby are independent of each other, but both
depend on Name. These two columns are said to be multivalued dependent on
Name.
This dependency can be represented like this:
Name →→ ProjectNo
Name →→ Hobby
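A multivalued dependency such as Name →→ ProjectNo can also be checked on sample rows: for each Name, every ProjectNo of that person must appear together with every Hobby of that person. The sketch below tests exactly that. The function name mvd_holds and the sample rows are assumptions made for illustration.

```python
from itertools import product


def mvd_holds(rows, lhs, rhs):
    """Check lhs ->> rhs on rows (a list of dicts with identical keys)."""
    if not rows:
        return True
    rest = [a for a in rows[0] if a not in lhs and a not in rhs]
    groups = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        z = tuple(row[a] for a in rest)
        groups.setdefault(x, set()).add((y, z))
    for pairs in groups.values():
        ys = {y for y, _ in pairs}
        zs = {z for _, z in pairs}
        # Within one lhs group, every rhs value must pair with every other value
        if pairs != set(product(ys, zs)):
            return False
    return True


# Hypothetical sample data: one person, two projects, two hobbies
rows = [
    {"Name": "Ram", "ProjectNo": "P1", "Hobby": "music"},
    {"Name": "Ram", "ProjectNo": "P1", "Hobby": "dance"},
    {"Name": "Ram", "ProjectNo": "P2", "Hobby": "music"},
    {"Name": "Ram", "ProjectNo": "P2", "Hobby": "dance"},
]

print(mvd_holds(rows, ["Name"], ["ProjectNo"]))      # True
print(mvd_holds(rows[:3], ["Name"], ["ProjectNo"]))  # False: (P2, dance) is missing
```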
A→A
F+ = F
repeat
    for each functional dependency f in F+
        apply reflexivity and augmentation rules on f
        add the resulting functional dependencies to F+
    for each pair of functional dependencies f1 and f2 in F+
        if f1 and f2 can be combined using transitivity
            then add the resulting functional dependency to F+
until F+ does not change any further
Example
Given a relation R = (A, B, C, G, H, I) with a set of FDs F = { A → B, A → C,
CG → H, CG → I, B → H }, find the closure F+.
Some members of F+:
A → H by transitivity from A → B and B → H
AG → I by augmenting A → C with G to get AG → CG,
and then transitivity with CG → I
CG → HI by augmenting CG → I with CG to infer CG → CGI,
augmenting CG → H with I to infer CGI → HI, and then transitivity
Practice:
1. Given R= { A,B,C,D} and set of FDs: F={ A→BC, B→AC, C→AB} find F+ .
2. Given R={ A,B,C,D,E} and F be the set of FDs: F={ A→BC, B→CD, E→ A,
C→ED}. Find the closure set of FD.
Algorithm to compute α+, the closure of an attribute set α under F:
result := α;
while (changes to result) do
    for each β → γ in F do
    begin
        if β ⊆ result then result := result ∪ γ
    end
Example of Attribute Set Closure
• R = (A, B, C, G, H, I)
• F = {A → B, A → C ,CG → H,CG → I, B → H}
• (AG)+
  1. result = AG
  2. result = ABCG (A → C and A → B)
  3. result = ABCGH (CG → H and CG ⊆ AGBC)
  4. result = ABCGHI (CG → I and CG ⊆ AGBCH)
• Is AG a candidate key?
  1. Is AG a super key?
     1. Does AG → R? == Is (AG)+ ⊇ R? YES
  2. Is any subset of AG a superkey?
     1. Does A → R? == Is (A)+ ⊇ R? A+ = {A, B, C, H}, so NO
     2. Does G → R? == Is (G)+ ⊇ R? G+ = {G}, so NO
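The attribute closure algorithm and the candidate key test from this example can be sketched in Python as follows. Each FD is represented as a pair of sets (left side, right side). The function names closure, is_superkey and is_candidate_key are chosen here for illustration only.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs when lhs is already inside the result
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, relation, fds):
    return closure(attrs, fds) >= relation


def is_candidate_key(attrs, relation, fds):
    attrs = set(attrs)
    if not is_superkey(attrs, relation, fds):
        return False
    # Minimal: removing any single attribute must lose the superkey property.
    # This is enough because every superset of a superkey is a superkey.
    return all(not is_superkey(attrs - {a}, relation, fds) for a in attrs)


R = set("ABCGHI")
F = [
    (set("A"), set("B")),
    (set("A"), set("C")),
    (set("CG"), set("H")),
    (set("CG"), set("I")),
    (set("B"), set("H")),
]

print(sorted(closure("AG", F)))      # ['A', 'B', 'C', 'G', 'H', 'I']
print(sorted(closure("A", F)))       # ['A', 'B', 'C', 'H']
print(sorted(closure("G", F)))       # ['G']
print(is_candidate_key("AG", R, F))  # True
```

The same functions can be reused for the practice questions by changing R and F.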
Practice:
1. Given R = {A, B, C, D, E, F} and the set of FDs: { A→B, C→DE,
AC→F, D→AF, E→CF }. Determine the closure set of attributes for A, B, C,
and DE.
2. Given R = {A, B, C, D, E, F, G} and the set of FDs: { A→B, BC→DE, AEG→G }.
Determine the attribute closures A+, AC+, ABC+.
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• Boyce-Codd Normal Form (BCNF)
• Fourth Normal Form (4NF)
• Fifth Normal Form (5NF)
• Domain Key Normal Form (DKNF)
[Figure: number of tables, redundancy, and complexity across the normal forms]
Most databases should be 3NF or BCNF in order to avoid the database anomalies.
Update anomaly: In the above table we have two rows for employee Sabin,
as he belongs to two departments of the company. If we want to update the
address of Sabin, we have to update it in both rows, or the data will
become inconsistent. If the correct address gets updated in one department
row but not in the other, then according to the database Sabin has two
different addresses, which is not correct.
Insert anomaly: Suppose a new employee joins the company, who is under
training and currently not assigned to any department then we would not be
able to insert the data into the table if emp_dept field doesn’t allow nulls.
Delete anomaly: Suppose that at some point the company closes the
department D890. Deleting the rows that have emp_dept as D890 would also
delete the information of employee Mohan, since he is assigned only to
this department.
First Normal Form
• We say a relation is in 1NF if all values stored in the relation are single-
valued and atomic.
• 1NF places restrictions on the structure of relations.
• Values must be simple.
91.2914 107
EmpNum  EmpPhone  EmpDegrees
123     233-9876
333     233-1231  BA, BSc, PhD
679     233-1231  BSc, MSc
Employee
EmpNum  EmpPhone
123     233-9876
333     233-1231
679     233-1231

EmployeeDegree
EmpNum  EmpDegree
333     BA
333     BSc
333     PhD
679     BSc
679     MSc
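The step from the table with the multivalued EmpDegrees column to the Employee and EmployeeDegree relations can be sketched in Python. This is only an illustration: it assumes the degrees are stored as one comma-separated string, and the function name to_1nf is made up here.

```python
def to_1nf(rows):
    """Split the multivalued EmpDegrees column into a separate relation."""
    employee = []
    employee_degree = []
    for emp_num, phone, degrees in rows:
        employee.append((emp_num, phone))
        for degree in degrees.split(","):
            degree = degree.strip()
            if degree:  # an employee with no degrees adds no rows
                employee_degree.append((emp_num, degree))
    return employee, employee_degree


# Rows of the table that is not in 1NF (degrees as a comma-separated string)
unnormalized = [
    (123, "233-9876", ""),
    (333, "233-1231", "BA, BSc, PhD"),
    (679, "233-1231", "BSc, MSc"),
]

employee, employee_degree = to_1nf(unnormalized)
print(employee)
print(employee_degree)
```

Each value in the result is now single-valued and atomic. Employee 123 has no degrees and therefore appears only in Employee.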
Second Normal Form
A relation is in 2NF if it is in 1NF, and every non-key attribute
is fully dependent on each candidate key. (That is, we don’t
have any partial functional dependency.)
Consider this InvoiceLine table. Table InvoiceLine is only in 1NF.
InvNum  LineNum  ProdNum  Qty  InvDate
InvNum, LineNum → ProdNum, Qty
InvNum → InvDate
Given only these dependencies, {InvNum, LineNum} is a candidate key.
InvDate is a non-key attribute, and it is dependent on InvNum alone,
which is only part of that key.
Table InvoiceLine is not 2NF since there is
a partial dependency of InvDate on InvNum.
InvoiceLine
InvNum LineNum ProdNum Qty InvDate
The above relation has redundancies: the invoice date is
repeated on every line of the same invoice.
We can improve the database by decomposing the relation
into two relations:
InvNum LineNum ProdNum Qty
InvNum InvDate
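The decomposition can be illustrated with a small Python sketch that projects InvoiceLine onto the two attribute lists and then joins the results back on InvNum. The sample rows, the variable names lines and invoices, and the helper names project and natural_join are all assumptions made for this illustration.

```python
def project(rows, attrs):
    """Project rows (dicts) onto attrs, dropping duplicate tuples."""
    result = []
    for row in rows:
        t = {a: row[a] for a in attrs}
        if t not in result:
            result.append(t)
    return result


def natural_join(left, right):
    """Join two lists of dicts on the attributes they have in common."""
    joined = []
    for lrow in left:
        for rrow in right:
            common = set(lrow) & set(rrow)
            if all(lrow[a] == rrow[a] for a in common):
                joined.append({**lrow, **rrow})
    return joined


# Hypothetical InvoiceLine rows: the date repeats on each line of invoice 1
invoice_line = [
    {"InvNum": 1, "LineNum": 1, "ProdNum": "P10", "Qty": 2, "InvDate": "2024-01-05"},
    {"InvNum": 1, "LineNum": 2, "ProdNum": "P20", "Qty": 1, "InvDate": "2024-01-05"},
    {"InvNum": 2, "LineNum": 1, "ProdNum": "P10", "Qty": 5, "InvDate": "2024-01-07"},
]

lines = project(invoice_line, ["InvNum", "LineNum", "ProdNum", "Qty"])
invoices = project(invoice_line, ["InvNum", "InvDate"])

print(len(lines))     # 3
print(len(invoices))  # 2: each invoice date is now stored once
rejoined = natural_join(lines, invoices)
```

Joining the two projections gives back the original rows, so no information is lost by the decomposition in this sample.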
Is the following relation in 2NF?
Prod_no → prod_desc
A transitive dependency occurs, so it is not in 3NF.
Third Normal Form
• A database is in third normal form if it satisfies the following conditions:
▪ It is in second normal form
▪ There is no transitive functional dependency
• By transitive functional dependency, we mean we have the following
relationships in the table: B is functionally dependent on A (A → B), and C is
functionally dependent on B (B → C). In this case, C is transitively dependent
on A via B.
• This definition of 3NF differs from BCNF only in the specification of non-
key attributes - 3NF is weaker than BCNF. (BCNF requires all determinants
to be candidate keys.)
• A relation in 3NF will not have any transitive dependencies
of non-key attribute on a candidate key through another non-key attribute.
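A 3NF check can be sketched in Python using the usual FD-based form of the definition: for every FD X → A in F, either X is a superkey or A is part of some candidate key. The relation below (EmpNum, EmpName, DeptNum, DeptName) and its FDs are a made-up example, and all function names are assumptions for illustration. Candidate keys are found by brute force, so this only suits small relations.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(relation, fds):
    """All minimal attribute sets whose closure covers the relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) >= relation:
                keys.append(candidate)
    return keys


def violations_3nf(relation, fds):
    """Return (lhs, attribute) pairs from fds that break 3NF."""
    prime = set().union(*candidate_keys(relation, fds))
    bad = []
    for lhs, rhs in fds:
        if closure(lhs, fds) >= relation:
            continue  # lhs is a superkey, so this FD is allowed
        for attr in sorted(rhs - lhs):
            if attr not in prime:
                bad.append((lhs, attr))
    return bad


# Hypothetical relation with a transitive dependency EmpNum -> DeptNum -> DeptName
R = {"EmpNum", "EmpName", "DeptNum", "DeptName"}
F = [
    ({"EmpNum"}, {"EmpName", "DeptNum"}),
    ({"DeptNum"}, {"DeptName"}),
]

print(candidate_keys(R, F))  # [{'EmpNum'}]
print(violations_3nf(R, F))  # [({'DeptNum'}, 'DeptName')]
```

Moving DeptNum and DeptName into their own relation removes the violation: the remaining FD EmpNum → EmpName, DeptNum has a key on its left side.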
Consider this Employee relation. Candidate keys are? …
Boyce-Codd Normal Form or BCNF
student_no course_no instr_no
student_no instr_no
course_no instr_no
1  C#   dance
2  C#   dance
2  Php  dance

hobby
S_id  Hobby
1     music
1     dance
2     dance