DBMS
DATABASE DESIGN
Topics: Functional Dependencies; Non-loss Decomposition; First, Second and Third Normal Forms; Dependency Preservation; Boyce/Codd Normal Form; Multi-valued Dependencies and Fourth Normal Form; Join Dependencies and Fifth Normal Form
Introduction
Relational database design requires that we find a good collection of relation schemas.
Pitfalls in Relational Database Design
A bad design may lead to
Repetition of information
Inability to represent certain information
Design Goals
a) Avoid redundant data.
b) Ensure that relationships among attributes are represented.
c) Facilitate the checking of updates for violation of database integrity
constraints.
Example: Consider the relation schema:
Lending-schema=(branch_name,branch_city,assets,customer_name,loan_no,amount).
branch_name  branch_city  assets     customer_name  loan_no  amount
Downtown     Brooklyn     90,00,000  Jones          L-17     1000
Redwood      Palo Alto    21,00,000  Smith          L-23     2000
Perryridge   Horseneck    17,00,000  Hayes          L-15     1500
Downtown     Brooklyn     90,00,000  Jackson        L-14     3500
Here the details of the Downtown branch (branch_city and assets) are stored twice, once for each loan. This leads to a redundancy problem.
Redundancy
Data for branch_name, branch_city and assets are repeated for each loan that a branch makes. This
(a) wastes space.
(b) complicates updating, introducing the possibility of inconsistent assets values.
Null values
(a) Cannot store information about a branch if no loan exists.
(b) Can use null values, but they are difficult to handle.
Decomposition
Decompose the relation schema Lending-schema into
Branch-schema = (branch_name, branch_city, assets)
Loan-schema = (customer_name, loan_no, branch_name, amount)
All attributes of the original schema (R) must appear in the decomposition (R1, R2):
R = R1 ∪ R2
Lossless-join decomposition: for all possible relations r on schema R,
r = ΠR1(r) ⋈ ΠR2(r)
Goal: To decide whether a particular relation R is in good form. If it is not, decompose it into a set of relations (R1, R2, ..., Rn) such that
Each relation is in good form.
The decomposition is a lossless-join decomposition.
Example of Non Lossless-Join Decomposition
Decomposition of R = (A, B)
R1 = (A), R2 = (B)
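A small Python sketch can make the loss visible. The instance of r used here is an assumption made for illustration (the notes do not list one). Projecting it onto R1 = (A) and R2 = (B) and joining the projections produces extra tuples, so r cannot be recovered from the two projections.

```python
# Hypothetical instance of r on R = (A, B); the values are illustrative only.
r = {("a", 1), ("b", 2)}

# Project r onto R1 = (A) and R2 = (B).
r1 = {(a,) for (a, _) in r}
r2 = {(b,) for (_, b) in r}

# R1 and R2 share no attribute, so their natural join is a cartesian product.
joined = {(a, b) for (a,) in r1 for (b,) in r2}

print(sorted(joined))
print("lossless" if joined == r else "lossy: the join contains spurious tuples")
```

The join contains every tuple of r plus tuples that were never in r, which is exactly the information loss a lossless-join decomposition must avoid.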
Functional Dependencies
A functional dependency requires that the value for a certain set of attributes determines uniquely the value for another set of attributes.
In a given relation R, let X and Y be attributes. Attribute Y is functionally dependent on attribute X if each value of X determines exactly one value of Y, which is represented as
X → Y
i.e., X determines Y, or Y is functionally dependent on X.
X → Y does not imply Y → X.
For example, in a student relation, if the value of the attribute Marks is known then the value of the attribute Grade is determined, since
Marks → Grade
Types
(a) Full functional dependency
(b) Partial functional dependency
(c) Transitive functional dependency
(a) Full dependencies
In a relation R, let X and Y be attributes. Y is fully functionally dependent on X if X functionally determines Y and no proper subset of X functionally determines Y.
For example, marks is fully functionally dependent on student_no and course_no together and not on any proper subset of {student_no, course_no}.
This means marks cannot be determined by either student_no or course_no alone. It can be determined only using student_no and course_no together.
Hence marks is fully functionally dependent on {student_no, course_no}.
(b) Partial dependencies
Attribute Y is partially dependent on the attribute X if it is dependent on a proper subset of X.
For example, course_name and Instructor_name are partially dependent on the composite attribute {student_no, course_no} because course_no alone determines course_name and Instructor_name.
(c) Transitive dependencies
X, Y and Z are three attributes in the relation R. If X → Y and Y → Z, then X → Z, and Z is transitively dependent on X.
For example, Grade depends on Marks and in turn Marks depends on {student_no, course_no}; hence Grade depends transitively on {student_no, course_no}.
Normalization Using Functional Dependencies
When we decompose a relation schema R with a set of functional dependencies F into R1, R2, ..., Rn we want
o Lossless-join decomposition: Otherwise decomposition would result in information loss.
o No redundancy: The relations Ri preferably should be in either Boyce-Codd Normal Form or Third Normal Form.
o Dependency preservation: Let Fi be the set of dependencies in F+ that include only attributes in Ri.
Preferably the decomposition should be dependency preserving, that is,
(F1 ∪ F2 ∪ ... ∪ Fn)+ = F+
Otherwise, checking updates for violation of functional dependencies may require computing joins, which is expensive.
Example
R = (A, B, C)
F = {A → B, B → C}
o Can be decomposed in two different ways
R1 = (A, B), R2 = (B, C)
o Lossless-join decomposition: R1 ∩ R2 = {B} and B → BC
o Dependency preserving
R1 = (A, B), R2 = (A, C)
o Lossless-join decomposition: R1 ∩ R2 = {A} and A → AB
o Not dependency preserving (cannot check B → C without computing R1 ⋈ R2)
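The lossless-join checks in this example use a standard sufficient condition for a two-way decomposition: the common attributes R1 ∩ R2 must functionally determine all of R1 or all of R2. A minimal Python sketch of that check follows; the helper names closure and is_lossless are hypothetical, chosen only for this illustration.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """True if R1 ∩ R2 determines all of R1 or all of R2."""
    common = closure(r1 & r2, fds)
    return r1 <= common or r2 <= common


# F = {A → B, B → C} on R = (A, B, C)
F = [({"A"}, {"B"}), ({"B"}, {"C"})]

print(is_lossless({"A", "B"}, {"B", "C"}, F))  # first decomposition in the example
print(is_lossless({"A", "B"}, {"A", "C"}, F))  # second decomposition in the example
print(is_lossless({"A", "C"}, {"B", "C"}, F))  # a split that fails the test
```

Both decompositions from the example pass. The third split shares only C, which determines nothing else, so the condition fails.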
Functional Dependency Theory
Functional dependencies are constraints on the set of legal relations.
They require that the value for a certain set of attributes determines uniquely the value for another set of attributes.
A functional dependency is a generalization of the notion of a key.
Dependencies can be chained: if X → Y and Y → Z, then X → Z.
Let R be a relation schema, with α ⊆ R and β ⊆ R.
The functional dependency α → β holds on R if and only if for any legal relation r(R), whenever any two tuples t1 and t2 of r agree on the attributes α, they also agree on the attributes β. That is,
t1[α] = t2[α] ⇒ t1[β] = t2[β]
Example: Consider r(A, B) with the following instance of r.
On this instance, A → B does NOT hold, but B → A does hold.
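The definition can be checked mechanically on any concrete instance. The sketch below is illustrative only: the three tuples are an assumed instance of r(A, B) (the instance table itself is not reproduced in these notes), and satisfies is a hypothetical helper name.

```python
def satisfies(rows, lhs, rhs):
    """True if every pair of rows that agrees on lhs also agrees on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Assumed instance of r(A, B), chosen so that it matches the statement above.
r = [
    {"A": 1, "B": 4},
    {"A": 1, "B": 5},
    {"A": 3, "B": 7},
]

print(satisfies(r, ["A"], ["B"]))  # two rows share A = 1 but differ on B
print(satisfies(r, ["B"], ["A"]))  # every B value occurs once
```

A check like this only tells us whether one instance satisfies a dependency. It cannot show that the dependency holds on the schema, which is a statement about all legal instances.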
K is a superkey for relation schema R if and only if K → R
K is a candidate key for R if and only if
o K → R, and
o for no α ⊂ K, α → R
Functional dependencies allow us to express constraints that cannot be expressed using
superkeys. Consider the schema:
bor_loan = (customer_id, loan_number, amount ).
We expect this functional dependency to hold:
o loan_number → amount
but would not expect the following to hold:
o amount → customer_name
Use of Functional Dependencies
We use functional dependencies to:
o test relations to see if they are legal under a given set of functional dependencies.
If a relation r is legal under a set F of functional dependencies, we say
that r satisfies F.
o specify constraints on the set of legal relations
We say that F holds on R if all legal relations on R satisfy the set of
functional dependencies F.
Note: A specific instance of a relation schema may satisfy a functional dependency even if the functional dependency does not hold on all legal instances.
o For example, a specific instance of loan may, by chance, satisfy amount → customer_name.
A functional dependency is trivial if it is satisfied by all instances of a relation
o Example:
customer_name, loan_number → customer_name
customer_name → customer_name
o In general, α → β is trivial if β ⊆ α
Closure of a set of Functional Dependencies
Given a set F of functional dependencies, there are certain other functional dependencies that are logically implied by F.
o For example: if A → B and B → C, then we can infer that A → C
The set of all functional dependencies logically implied by F is the closure of F.
We denote the closure of F by F+.
We can find all of F+ by applying Armstrong's Axioms:
o if β ⊆ α, then α → β (reflexivity)
o if α → β, then γα → γβ (augmentation)
o if α → β and β → γ, then α → γ (transitivity)
These rules are
o sound (generate only functional dependencies that actually hold) and
o complete (generate all functional dependencies that hold).
We can further simplify manual computation of F+ by using the following additional rules.
o If α → β holds and α → γ holds, then α → βγ holds (union)
o If α → βγ holds, then α → β holds and α → γ holds (decomposition)
o If α → β holds and γβ → δ holds, then αγ → δ holds (pseudotransitivity)
The above rules can be inferred from Armstrong's axioms.
Example
R = (A, B, C, G, H, I)
F = {A → B
     A → C
     CG → H
     CG → I
     B → H}
Some members of F+:
o A → H
  by transitivity from A → B and B → H
o AG → I
  by augmenting A → C with G, to get AG → CG, and then transitivity with CG → I
o CG → HI
  by augmenting CG → I to infer CG → CGI, and augmenting CG → H to infer CGI → HI, and then transitivity
Procedure for Computing F+
To compute the closure of a set of functional dependencies F:
    F+ = F
    repeat
        for each functional dependency f in F+
            apply reflexivity and augmentation rules on f
            add the resulting functional dependencies to F+
        for each pair of functional dependencies f1 and f2 in F+
            if f1 and f2 can be combined using transitivity
                then add the resulting functional dependency to F+
    until F+ does not change any further
NOTE: We shall see an alternative procedure for this task later
Closure of Attributes Sets
Given a set of attributes α, define the closure of α under F (denoted by α+) as the set of attributes that are functionally determined by α under F.
Algorithm to compute α+, the closure of α under F:
    result := α;
    while (changes to result) do
        for each β → γ in F do
        begin
            if β ⊆ result then result := result ∪ γ
        end
Example of Attribute Set Closure
R = (A, B, C, G, H, I)
F = {A → B
     A → C
     CG → H
     CG → I
     B → H}
(AG)+
1. result = AG
2. result = ABCG (A → C and A → B)
3. result = ABCGH (CG → H and CG ⊆ AGBC)
4. result = ABCGHI (CG → I and CG ⊆ AGBCH)
Is AG a candidate key?
1. Is AG a super key?
   1. Does AG → R? == Is (AG)+ ⊇ R
2. Is any subset of AG a superkey?
   1. Does A → R? == Is (A)+ ⊇ R
   2. Does G → R? == Is (G)+ ⊇ R
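The attribute closure algorithm and the candidate key questions above translate almost line for line into Python. This is a minimal sketch under the assumption that each attribute is a single letter and each dependency is a pair of sets; closure is a hypothetical helper name.

```python
def closure(attrs, fds):
    """Attribute closure: repeat until no dependency adds anything new."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


R = set("ABCGHI")
F = [(set(lhs), set(rhs)) for lhs, rhs in
     [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]]

ag_closure = closure("AG", F)
is_superkey = ag_closure >= R
# AG is a candidate key if it is a superkey and neither A nor G alone is one.
is_candidate_key = is_superkey and not any(closure(x, F) >= R for x in ("A", "G"))

print(sorted(ag_closure))
print(is_superkey, is_candidate_key)
```

Here (A)+ stops at ABCH and (G)+ is just G, so neither proper subset of AG reaches all of R.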
Uses of Attribute Closure
There are several uses of the attribute closure algorithm:
Testing for superkey:
o To test if α is a superkey, we compute α+, and check if α+ contains all attributes of R.
Testing functional dependencies
o To check if a functional dependency α → β holds (or, in other words, is in F+), just check if β ⊆ α+.
o That is, we compute α+ by using attribute closure, and then check if it contains β.
o This is a simple and cheap test, and very useful.
Computing closure of F
o For each γ ⊆ R, we find the closure γ+, and for each S ⊆ γ+, we output a functional dependency γ → S.
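The last two uses can be sketched in a few lines of Python. This is an illustration, not a definitive implementation: the small schema R = (A, B, C) with F = {A → B, B → C} is assumed for the demonstration, the helper names are hypothetical, and the enumeration skips empty left and right sides to keep the output small.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def implied(lhs, rhs, fds):
    """Test whether lhs → rhs is in F+ by checking rhs ⊆ lhs+."""
    return set(rhs) <= closure(lhs, fds)


def nonempty_subsets(attrs):
    items = sorted(attrs)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def fd_closure(schema, fds):
    """For each γ ⊆ R and each S ⊆ γ+, output γ → S (empty sets skipped)."""
    return {(gamma, s)
            for gamma in nonempty_subsets(schema)
            for s in nonempty_subsets(closure(gamma, fds))}


F = [({"A"}, {"B"}), ({"B"}, {"C"})]
f_plus = fd_closure({"A", "B", "C"}, F)

print(implied("A", "C", F))  # A → C follows by transitivity
print(implied("C", "A", F))  # nothing determines A
print(len(f_plus))           # size grows exponentially with the schema
```

The enumeration is exponential in the number of attributes, which is why the single closure test in implied is the one used in practice.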
Canonical Cover
Sets of functional dependencies may have redundant dependencies that can be inferred from the others
o For example: A → C is redundant in: {A → B, B → C, A → C}
o Parts of a functional dependency may be redundant
E.g.: on RHS: {A → B, B → C, A → CD} can be simplified to
{A → B, B → C, A → D}
E.g.: on LHS: {A → B, B → C, AC → D} can be simplified to
{A → B, B → C, A → D}
Intuitively, a canonical cover of F is a minimal set of functional dependencies equivalent to F, having no redundant dependencies or redundant parts of dependencies
Extraneous Attributes
Consider a set F of functional dependencies and the functional dependency α → β in F.
o Attribute A is extraneous in α if A ∈ α and F logically implies (F − {α → β}) ∪ {(α − A) → β}.
o Attribute A is extraneous in β if A ∈ β and the set of functional dependencies (F − {α → β}) ∪ {α → (β − A)} logically implies F.
Note: implication in the opposite direction is trivial in each of the cases above, since a stronger functional dependency always implies a weaker one
Example: Given F = {A → C, AB → C}
o B is extraneous in AB → C because {A → C, AB → C} logically implies A → C (i.e. the result of dropping B from AB → C).
Example: Given F = {A → C, AB → CD}
o C is extraneous in AB → CD since AB → C can be inferred even after deleting C.
Testing if an Attribute is Extraneous
Consider a set F of functional dependencies and the functional dependency α → β in F.
To test if attribute A ∈ α is extraneous in α
1. compute (α − A)+ using the dependencies in F
2. check that (α − A)+ contains β; if it does, A is extraneous in α
To test if attribute A ∈ β is extraneous in β
1. compute α+ using only the dependencies in F' = (F − {α → β}) ∪ {α → (β − A)},
2. check that α+ contains A; if it does, A is extraneous in β
Definition of Canonical Cover
A canonical cover for F is a set of dependencies Fc such that
o F logically implies all dependencies in Fc, and
o Fc logically implies all dependencies in F, and
o No functional dependency in Fc contains an extraneous attribute, and
o Each left side of a functional dependency in Fc is unique.
To compute a canonical cover for F:
    repeat
        Use the union rule to replace any dependencies in F of the form
            α1 → β1 and α1 → β2 with α1 → β1 β2
        Find a functional dependency α → β with an extraneous attribute either in α or in β
        If an extraneous attribute is found, delete it from α → β
    until F does not change
Note: Union rule may become applicable after some extraneous attributes have been deleted, so it has to be re-applied
Computing a Canonical Cover
R = (A, B, C)
F = {A → BC
     B → C
     A → B
     AB → C}
Combine A → BC and A → B into A → BC
o Set is now {A → BC, B → C, AB → C}
A is extraneous in AB → C
o Check if the result of deleting A from AB → C is implied by the other dependencies
  Yes: in fact, B → C is already present!
o Set is now {A → BC, B → C}
C is extraneous in A → BC
o Check if A → C is logically implied by A → B and the other dependencies
  Yes: using transitivity on A → B and B → C.
  Can use attribute closure of A in more complex cases
The canonical cover is: A → B
                        B → C
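The same computation can be sketched in Python, combining the union rule with the two extraneous-attribute tests described earlier. The function names are hypothetical, and this sketch may remove extraneous attributes in a different order from the hand computation above, although it arrives at the same cover for this F.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def remove_one_extraneous(fds):
    """Return fds with one extraneous attribute removed, or None if there is none."""
    for i, (lhs, rhs) in enumerate(fds):
        for a in sorted(lhs):
            # a is extraneous in the left side if (lhs - a)+ under F contains rhs.
            if rhs <= closure(lhs - {a}, fds):
                return fds[:i] + [(lhs - {a}, rhs)] + fds[i + 1:]
        for a in sorted(rhs):
            # a is extraneous in the right side if lhs+ under F' still contains a.
            trial = fds[:i] + [(lhs, rhs - {a})] + fds[i + 1:]
            if a in closure(lhs, trial):
                return trial
    return None


def canonical_cover(fds):
    fds = [(frozenset(lhs), frozenset(rhs)) for lhs, rhs in fds]
    while True:
        # Union rule: merge dependencies that share a left side.
        merged = {}
        for lhs, rhs in fds:
            merged[lhs] = merged.get(lhs, frozenset()) | rhs
        fds = [(lhs, rhs) for lhs, rhs in merged.items() if rhs]
        reduced = remove_one_extraneous(fds)
        if reduced is None:
            return fds
        fds = reduced


F = [("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C")]
cover = canonical_cover(F)
for lhs, rhs in cover:
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
```

The union rule is re-applied at the top of every pass, which matches the note that it may become applicable again after an attribute has been deleted.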
Normalization
o Normalization is an essential part of database design.
o The concept of normalization helps the designer to build an efficient design.
Purpose of Normalization:
Minimize redundancy in data.
Remove insert, delete and update anomalies during database activities.
Reduce the need to reorganize data when it is modified or enhanced.
Normalization reduces a complex user view to a set of small and stable subgroups of fields/relations.
This process helps to design a logical data model known as the conceptual data model.
Normal Forms:
The different normal forms are:
1. First normal form (1NF)
2. Second normal form (2NF)
3. Third normal form (3NF)
4. Boyce-Codd normal form (BCNF)
5. Fourth normal form (4NF)
6. Fifth normal form (5NF)
First normal form (1NF): A relation is said to be in first normal form if, starting from the unnormalized form, all repeating groups have been removed.
Second normal form (2NF): A relation is said to be in second normal form if it is already in first normal form and it has no partial dependency.
Third normal form (3NF): A relation is said to be in third normal form if it is already in second normal form and it has no transitive dependency.
Boyce-Codd normal form (BCNF): A relation is said to be in Boyce-Codd normal form if it is already in third normal form and every determinant is a candidate key. It is a stronger version of 3NF.
Fourth normal form (4NF): A relation is said to be in fourth normal form if it is already in BCNF and it has no nontrivial multivalued dependency whose determinant is not a superkey.
Fifth normal form (5NF): A relation is said to be in fifth normal form if it is already in 4NF and every nontrivial join dependency is implied by its candidate keys.
Multivalued dependency:
Consider three fields X, Y and Z in a relation.
Suppose that for each value of X there is a well-defined set of values of Y and a well-defined set of values of Z.
If the set of values of Y is independent of the set of values of Z, then a multivalued dependency exists, written X →→ Y | Z.
Join dependency:
A relation has a join dependency if it can be reconstructed, without losing tuples or introducing spurious ones, by joining a set of its projections. Decomposing by projection a relation that does not satisfy the corresponding join dependency gives undesirable results, because the join no longer returns the original relation.
Normal forms:
Case Problem:
Let us consider the Invoice Report of Alpha Book House as shown below
Alpha Book House
Pune-413001
Customer No.: 1052
Customer Name: Beta school of computer science
Address: Shivaji Nagar, Pune-01.
ISBN      Book Title  Author's Name  Author's Country  Qty  Price(Rs)  Amount(Rs)
81-203-5  DOS         P.K.Sinha      India             5    250        1250
0-112-6   DBMS        Korth          U.S.A             6    300        1800
1-213-9   Simulation  Gordon         U.S.A             5    100        500
                                                       Grand Total     3550
Fig: Invoice of a book company.
Normalization of Invoice report up to 3NF:
The invoice report shown in the figure is represented in the form of a relation, named the Invoice relation. This is in unnormalized form.
1. Invoice (Cust_No, Cust_Name, Cust_Add, (ISBN, Title, Author_Name, Author_Country, Qty, Unit_Price))
The amount in the last column of the invoice for each book can be computed by multiplying the number of copies (Qty) by the unit price.
Similarly, the grand total at the bottom of the invoice, representing the invoice amount, can be calculated by summing the entries in the last column of the invoice.
First normal form relations (1NF)
A relation is said to be in first normal form if, starting from the unnormalized form, all repeating groups have been removed.
In the Invoice relation, the fields in the innermost parentheses together form a repeating group. This results in redundancy of data for the first three fields, and the redundancy leads to inconsistency of data. Hence the relation is divided into two sub-relations to remove the repeating group as follows:
2. Customer (Cust_No, Cust_Name, Cust_Add)
3. Customer_Book (Cust_No, ISBN, Title, Author_Name, Author_Country, Qty, Unit_Price)
Each of the above relations (2 and 3) is in 1NF.
Second normal form (2NF)
A relation is said to be in second normal form if it is already in first normal form and it has no partial dependency.
In a relation whose key consists of more than one field, some non-key fields may depend on the entire key, while other non-key fields depend on only one of the key fields. Such a dependency is called a partial dependency.
2NF removes partial dependency among the attributes of a relation.
In Relation 2, there is only one key field and hence there is no scope for partial dependency.
The absence of partial dependency in Relation 2 takes it into 2NF without any modification.
In Relation 3, there are two key fields. The dependency diagram of Relation 3 is shown below:
Figure: Dependency diagram of Relation 3.
In the above figure, Qty depends on Cust_No and ISBN together, but the remaining non-key fields (Title, Author_Name, Author_Country, Unit_Price) depend only on ISBN. Thus there exists a partial dependency.
The existence of partial dependency results in insertion, update and deletion anomalies.
Insertion anomaly:
In Relation 3, if we want to insert the data of a new book (ISBN, Title, Author_Name, Author_Country, Unit_Price), there must be at least one customer. This means that the data of the new book can be entered into Relation 3 only when the first customer buys the book.
Update anomaly:
In Relation 3, if we want to change any of the non-key fields, like Title or Author_Name, it may result in inconsistency because the same value is stored in more than one record.
Deletion anomaly:
In Relation 3, if a book is purchased by only one customer, then the book data will be lost when we delete the record after fully satisfying the customer order.
Hence Relation 3 is divided into two relations as shown below:
4. Sales (Cust_No, ISBN, Qty)
5. Book_Author (ISBN, Title, Author_Name, Author_Country, Unit_Price)
Relations 4 and 5 are now in 2NF.
Third normal form (3NF)
A relation is said to be in third normal form if it is already in second normal form and it has no transitive dependency.
In a relation there may be a dependency among non-key fields. Such a dependency is called a transitive dependency.
Third normal form removes transitive dependency among the attributes of a relation.
In Relation 4, there is only one non-key field, so there is no question of dependency between non-key fields. Thus there is no transitive dependency, and Relation 4 is in 3NF.
In Relation 2, there is no dependency between the non-key fields. This means that it has no transitive dependency. Hence Relation 2 is also in 3NF.
In Relation 5, the author's country depends on the author's name. This means that Relation 5 has a transitive dependency.
The dependency diagram is shown below:
Figure: Dependency diagram for Relation 5.
The existence of transitive dependency results in insert, update and delete anomalies.
Insertion anomaly:
Suppose the book company has resident authors who are in the process of developing new books. It will be difficult to include these authors' details in Relation 5, because there must be at least one published book before the details of an author can be inserted.
Update anomaly:
If an author's country is to be modified, it is necessary to modify every tuple in which it appears, as the same data is stored in a number of tuples.
Deletion anomaly:
If the only book of a resident author is not reprinted and its record is deleted, then the respective author's data will be lost.
Hence, to overcome all the anomalies, Relation 5 is subdivided into two relations:
6. Book (ISBN, Title, Unit_Price, Author_Name)
7. Author (Author_Name, Author_Country)
In Relation 6, Author_Name is a foreign key that refers to Relation 7.
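The end result of the case problem can be illustrated by projecting the invoice lines onto the four final relations (Customer, Sales, Book and Author). The Python sketch below is only an illustration: the rows are copied from the invoice figure, and project is a hypothetical helper that performs a duplicate-eliminating projection.

```python
# One row per invoice line, in the flattened (1NF) shape of Relations 2 and 3 combined.
invoice_rows = [
    {"Cust_No": 1052, "Cust_Name": "Beta school of computer science",
     "Cust_Add": "Shivaji Nagar, Pune-01", "ISBN": "81-203-5", "Title": "DOS",
     "Author_Name": "P.K.Sinha", "Author_Country": "India", "Qty": 5, "Unit_Price": 250},
    {"Cust_No": 1052, "Cust_Name": "Beta school of computer science",
     "Cust_Add": "Shivaji Nagar, Pune-01", "ISBN": "0-112-6", "Title": "DBMS",
     "Author_Name": "Korth", "Author_Country": "U.S.A", "Qty": 6, "Unit_Price": 300},
    {"Cust_No": 1052, "Cust_Name": "Beta school of computer science",
     "Cust_Add": "Shivaji Nagar, Pune-01", "ISBN": "1-213-9", "Title": "Simulation",
     "Author_Name": "Gordon", "Author_Country": "U.S.A", "Qty": 5, "Unit_Price": 100},
]


def project(rows, attrs):
    """Projection with duplicate elimination, returned as a set of tuples."""
    return {tuple(row[a] for a in attrs) for row in rows}


customer = project(invoice_rows, ["Cust_No", "Cust_Name", "Cust_Add"])   # Relation 2
sales = project(invoice_rows, ["Cust_No", "ISBN", "Qty"])                # Relation 4
book = project(invoice_rows, ["ISBN", "Title", "Unit_Price", "Author_Name"])  # Relation 6
author = project(invoice_rows, ["Author_Name", "Author_Country"])        # Relation 7

print(len(invoice_rows), "invoice lines")
print(len(customer), "customer row;", len(sales), "sales rows;",
      len(book), "book rows;", len(author), "author rows")
```

The customer details, which were repeated on every invoice line, now appear once, and each book and author is stored independently of any sale.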
Boyce-Codd Normal Form (BCNF)
A relation is said to be in Boyce-Codd normal form if it is already in third normal form and every determinant is a candidate key. It is a stronger version of 3NF.
A determinant is any field (simple or composite) on which some other field is fully functionally dependent.
BCNF deals with relational tables that have multiple candidate keys, composite candidate keys, and candidate keys that overlap.
CASE 1: Multiple candidate keys
Consider the following relational table
A#   Aname   Aqualification  Astatus  TitleID  Royalty
100  Arora   Ph.D            10       T1       3000
110  Sharma  M.Tech          20       T2       4000
Suppose Astatus depends on Aqualification. Then this relation has three determinants: A#, TitleID and Aqualification.
But only the combination (A#, TitleID) is a candidate key, so this relation is not in BCNF.
For a relation to be in BCNF, each determinant must be a candidate key.
CASE 2: Composite candidate keys
Suppose there are three relations as given below:
A#  TitleID  Royalty
A1  T1       5000
A2  T2       7000
Fig: Author-title table
A#  Aname  Aqualification
A1  John   Ph.D
A2  Tom    M.Tech
Fig: Author table
Aqualification  Astatus
Ph.D            10
M.Tech          20
B.Tech          30
Others          40
Fig: Author_qualification-status table
The Author-title table has one determinant, (A#, TitleID). It is also a candidate key, so the relation is in BCNF.
The Author table has one determinant, A#, and it is also a candidate key, so Author is in BCNF.
Similarly, Author_qualification-status is also in BCNF.
CASE 3: Candidate keys that overlap
Consider the following relation:
A#  Aname  TitleID  Royalty
A1  John   T1       3000
A2  Tom    T2       4000
Fig: Author table
Suppose that each author's name is unique. Then the candidate keys of the above relation are (A#, TitleID) and (Aname, TitleID), which overlap on TitleID.
The relation has the two determinants A# and Aname: A# determines Aname, and Aname determines A#.
Neither A# nor Aname alone is a candidate key, so the relation contains determinants that are not candidate keys.
Hence the relation Author is not in BCNF. But the relation is in 3NF, because A# and Aname are each part of a candidate key.
So a relation in 3NF need not be in BCNF, but every relation in BCNF is in 3NF.
So BCNF is stronger than 3NF.
BCNF Decomposition Algorithm
    result := {R};
    done := false;
    compute F+;
    while (not done) do
        if (there is a schema Ri in result that is not in BCNF)
        then begin
            let α → β be a nontrivial functional dependency that holds on Ri
            such that α → Ri is not in F+, and α ∩ β = ∅;
            result := (result − Ri) ∪ (Ri − β) ∪ (α, β);
        end
        else done := true;
Note: each Ri is in BCNF, and decomposition is lossless-join.
Example of BCNF Decomposition
R = (branch-name, branch-city, assets, customer-name, loan-number, amount)
F = {branch-name → assets branch-city
     loan-number → amount branch-name}
Key = {loan-number, customer-name}
Decomposition
o R1 = (branch-name, branch-city, assets)
o R2 = (branch-name, customer-name, loan-number, amount)
o R3 = (branch-name, loan-number, amount)
o R4 = (customer-name, loan-number)
Final decomposition
R1, R3, R4
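The decomposition above can be reproduced with a short Python sketch of the algorithm. Two assumptions are made for the illustration: a violation is found by brute force over subsets of Ri using the attribute closure, and the violating dependency is taken to be α → (α+ − α) ∩ Ri, so that β and α are disjoint as the algorithm requires. The function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def find_violation(schema, fds):
    """Return (alpha, beta) for a BCNF violation on schema, or None."""
    attrs = sorted(schema)
    for size in range(1, len(attrs)):
        for combo in combinations(attrs, size):
            alpha = set(combo)
            beta = (closure(alpha, fds) & schema) - alpha
            # Violation: alpha determines something in the schema, but not all of it.
            if beta and alpha | beta != schema:
                return alpha, beta
    return None


def bcnf_decompose(schema, fds):
    result, todo = [], [set(schema)]
    while todo:
        ri = todo.pop()
        violation = find_violation(ri, fds)
        if violation is None:
            result.append(ri)
        else:
            alpha, beta = violation
            todo.append(alpha | beta)  # the schema (α, β)
            todo.append(ri - beta)     # the schema Ri − β
    return result


R = {"branch-name", "branch-city", "assets", "customer-name", "loan-number", "amount"}
F = [({"branch-name"}, {"assets", "branch-city"}),
     ({"loan-number"}, {"amount", "branch-name"})]

for schema in bcnf_decompose(R, F):
    print(sorted(schema))
```

The subset search is exponential, so this is suitable only for small schemas like the one in the example.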
Testing Decomposition for BCNF
To check if a relation Ri in a decomposition of R is in BCNF,
o either test Ri for BCNF with respect to the restriction of F to Ri (that is, all FDs in F+ that contain only attributes from Ri),
o or use the original set of dependencies F that hold on R, but with the following test:
  for every set of attributes α ⊆ Ri, check that α+ (the attribute closure of α) either includes no attribute of Ri − α, or includes all attributes of Ri.
If the condition is violated by some α → β in F, the dependency
α → (α+ − α) ∩ Ri
can be shown to hold on Ri, and Ri violates BCNF.
We use the above dependency to decompose Ri.
BCNF and Dependency Preservation
It is not always possible to get a BCNF decomposition that is dependency preserving
R = (J, K, L)
F = {JK → L
     L → K}
Two candidate keys = JK and JL
R is not in BCNF
Any decomposition of R will fail to preserve
JK → L
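Whether a given decomposition preserves a dependency can be tested without computing F+ or any join. The Python sketch below uses the usual closure-based test: starting from the left side, repeatedly add whatever can be derived inside each Ri alone. The decomposition (K, L), (J, L) is an assumption for the demonstration (it is what splitting on L → K would give), and preserves is a hypothetical helper name.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def preserves(fd, decomposition, fds):
    """True if fd can be checked using only dependencies local to each Ri."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            # Attributes of Ri that are determined by the part of result inside Ri.
            gained = closure(result & ri, fds) & ri
            if not gained <= result:
                result |= gained
                changed = True
    return rhs <= result


F = [({"J", "K"}, {"L"}), ({"L"}, {"K"})]
decomposition = [{"K", "L"}, {"J", "L"}]

print(preserves(({"J", "K"}, {"L"}), decomposition, F))  # JK → L
print(preserves(({"L"}, {"K"}), decomposition, F))       # L → K
```

For this decomposition L → K can still be enforced inside (K, L), but JK → L spans both schemas and is lost.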
3NF Decomposition Algorithm
    Let Fc be a canonical cover for F;
    i := 0;
    for each functional dependency α → β in Fc do
        if none of the schemas Rj, 1 ≤ j ≤ i contains α β
        then begin
            i := i + 1;
            Ri := α β
        end
    if none of the schemas Rj, 1 ≤ j ≤ i contains a candidate key for R
    then begin
        i := i + 1;
        Ri := any candidate key for R;
    end
    return (R1, R2, ..., Ri)
The above algorithm ensures that:
o each relation schema Ri is in 3NF
o the decomposition is dependency preserving and lossless-join
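A compact Python sketch of the 3NF algorithm follows. It assumes the canonical cover is already given as input, finds a candidate key by brute force, and uses the fact that a schema contains a candidate key exactly when its closure covers R. The function names and the third example schema (A, B, C, D) are assumptions made for the illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_key(schema, fds):
    """Smallest attribute set (in sorted order) whose closure covers the schema."""
    attrs = sorted(schema)
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            if closure(combo, fds) >= schema:
                return set(combo)


def synthesize_3nf(schema, cover):
    schemas = []
    for lhs, rhs in cover:
        # Add α β only if no existing schema already contains it.
        if not any(lhs | rhs <= s for s in schemas):
            schemas.append(lhs | rhs)
    # A schema contains a candidate key exactly when it is a superkey of R.
    if not any(closure(s, cover) >= schema for s in schemas):
        schemas.append(candidate_key(schema, cover))
    return schemas


abc = synthesize_3nf(set("ABC"), [({"A"}, {"B"}), ({"B"}, {"C"})])
jkl = synthesize_3nf(set("JKL"), [({"J", "K"}, {"L"}), ({"L"}, {"K"})])
abcd = synthesize_3nf(set("ABCD"), [({"A"}, {"B"}), ({"C"}, {"D"})])

for name, result in (("ABC", abc), ("JKL", jkl), ("ABCD", abcd)):
    print(name, [sorted(s) for s in result])
```

For R = (J, K, L) the result is the single schema (J, K, L): it stays in 3NF and keeps JK → L, which a BCNF decomposition would lose.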
Comparison of BCNF and 3NF
It is always possible to decompose a relation into relations in 3NF such that
o the decomposition is lossless
o the dependencies are preserved
It is always possible to decompose a relation into relations in BCNF such that
o the decomposition is lossless
o but it may not be possible to preserve all dependencies.
BCNF is stricter than 3NF: it allows no nontrivial dependency whose determinant is not a superkey.