8 Part A Relational Database Design

The document discusses relational database design, focusing on normalization and functional dependencies to eliminate redundancy and anomalies. It outlines various normal forms, including 1NF, 2NF, and 3NF, and provides examples of functional dependencies and their implications on database structure. Additionally, it presents a schema design exercise to illustrate the application of normalization principles.

Relational Database Design

Pno. Projtitle Rollno Name Class mks
10 Tourism RM2007022 Kunal SE 25
RM2007025 Tirth
RF2007023 Pooja
30 Hospital RM2006011 Yazad TE 50
RM2006014 Sahil
20 Content RF2005073 Pooja BE 100
RM2007022 Kunal SE
40 RTO
Each project has a set of students
Projno. Projtitle Rollno Studname Class mks
10 tourism RM2007022 Kunal SE 25
10 tourism RM2007025 Tirth SE 25
10 toursm RF2007023 Pooja SE 25
30 Hospital mgmt RM2006011 Yazad TE 50
30 Hospital mgmt RM2006014 Sahil TE 50
20 Content mgmt RF200573 Pooja BE 100
RM2007022 Kunal SE
40 RTO
Redundancy may cause spelling mistakes and thus data inconsistency
Projno. Projtitle Rollno Studname Class mks
10 tourism RM2007022 Kunal SE 25
10 tourism RM2007025 Tirth SE 25
10 toursm RF2007023 Pooja SE 25
30 Hospital mgmt RM2006011 Yazad TE 50
30 Hospital mgmt RM2006014 Sahil TE 50
RF2005073 Pooja BE 100
RM200722 Kunal SE
25 RTO
Redundant data creates difficulty in searching
Projno. Projtitle Rollno Studname Class mks
10 tourism RM2007022 Kunal SE 25
10 tourism RM2007025 Tirth SE 25
10 tourism RF2007023 Pooja Manwani SE 25
30 Hospital mgmt RM2006011 Yazad TE 50
30 Hospital mgmt RM2006014 Sahil TE 50
RF2005073 Pooja Mehta BE 100
RM2007022 Kunal SE
25 RTO
Primary key is not defined, hence data can repeat
Projno. Projtitle Rollno Studname Class mks
10 tourism RM2007022 Kunal SE 25
10 tourism RM2007025 Tirth SE 25
10 tourism RF2007023 Pooja SE 25
30 Hospital mgmt RM2006011 Yazad TE 50
30 Hospital mgmt RM2006014 Sahil TE 50
RF2005073 Pooja BE 100
RM2007022 Kunal SE
25 RTO
Primary key is not defined, hence data can be null

Projno. Projtitle Rollno Studname Class mks
10 tourism RM2007022 Kunal SE 25
10 tourism RM2007025 Tirth SE 25
10 tourism RF2007023 Pooja SE 25
30 Hospital mgmt RM2006011 Yazad TE 50
30 Hospital mgmt RM2006014 Sahil TE 50
NULL NULL RF2005073 Pooja BE 100
NULL NULL RM2007022 Kunal SE NULL
25 RTO NULL NULL NULL NULL
Insertion anomaly
Projno. Projtitle Rollno Studname Class mks
10 tourism RM2007022 Kunal SE 25
10 tourism RM2007025 Tirth SE 25
10 tourism RF2007023 Pooja SE 25
30 Hospital mgmt RM2006011 Yazad TE 50
30 Hospital mgmt RM2006014 Sahil TE 50
Null Null RF2005073 Pooja BE 100
Null Null RM2007022 Kunal SE Null
25 RTO Null Null Null Null
Deletion anomaly
Projno. Projtitle Rollno Studname Class mks
10 tourism RM2007022 Kunal SE 25
10 tourism RM2007025 Tirth SE 25
10 tourism RF2007023 Pooja SE 25
30 Hospital mgmt RM2006011 Yazad TE 50
30 Hospital mgmt RM2006014 Sahil TE 50
RF2005073 Pooja BE 100
RM2007022 Kunal SE
25 RTO
Update anomaly
Projno. Projtitle Rollno Studname Class mks
10 tourism RM2007022 Kunal SE 25
10 tourism RM2007025 Tirth SE 25
10 tourism RF2007023 Pooja SE 25
30 Hospital admn RM2006011 Yazad TE 50
30 Hospital admn RM2006014 Sahil TE 50
RF2005073 Pooja BE 100
RM2007022 Kunal SE
25 RTO
Dependency
Projno. Projtitle Rollno Studname Class mks
10 tourism RM2007022 Kunal SE 25
10 tourism RM2007025 Tirth SE 25
10 tourism RF2007023 Pooja SE 25
30 Hospital mgmt RM2006011 Yazad TE 50
30 Hospital mgmt RM2006014 Sahil TE 50
RF2005073 Pooja BE 100
RM2007022 Kunal SE
25 RTO
Normalization & Normal form
• Normalization
It is a tool to validate and improve a logical design, so that
it satisfies certain conditions that avoid redundancy and
anomalies like insert anomaly, delete anomaly, update
anomaly.
It is based on the analysis of functional dependencies.

• Normal form
It is a state of relation that results by applying simple
rules regarding functional dependencies (or relationships
between attributes) to that relation.
Types of Normal forms
• First normal form
• Second normal form
• Third normal form
• Boyce-Codd normal form
• Fourth normal form
• Fifth normal form
Functional Dependency
• A kind of Unique-value constraint.
• Knowledge of these constraints is vital to eliminate
redundancy in a database.
• An FD on a relation R is of the form X → Y
(X functionally determines Y), where X and Y are sets of
attributes of R, such that whenever two tuples in R
have the same values on all the attributes of X, they must
have the same values on all the attributes of Y.
• Given X → Y where Y is the set of all attributes of R,
X is a key for R (strictly, a superkey; it is a candidate key
if no proper subset of X has the same property).
Example FDs
• Consider the following relational schema
student_t(st_id, name, birth_date,
gender, hostel_no, room_no, sem,
year)
Example FDs
st_id name birth_date gender hostel_no room_no sem year

2009A7PS001 arun 01/12/1991 m BH-1 1 1 2011-12


2009A7PS001 arun 01/12/1991 m BH-1 1 2 2011-12
2009A7PS001 arun 01/12/1991 m BH-3 10 1 2010-11
2009A7PS001 arun 01/12/1991 m BH-4 1 2 2010-11
2009A7PS011 arti 11/10/1991 f CH-4 5 1 2010-11
2009A7PS011 arti 11/10/1991 f CH-4 1 2 2010-11
2009A7PS011 arti 11/10/1991 f CH-5 5 1 2011-12
Example FDs
The following FDs hold on ‘student_t’:
st_id → name
st_id → birth_date
st_id → gender
or
st_id → name birth_date gender

But st_id → hostel_no, st_id → room_no,
st_id → sem and st_id → year
do not hold on ‘student_t’
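
The FD definition can be checked mechanically against a table instance. The following is a minimal sketch in Python: the helper name fd_holds is hypothetical, and the rows are the student_t rows from the example slide. Note that an instance can only refute an FD; an FD that happens to hold on one instance is not thereby proven for the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds on this instance (list of dicts)."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(x, y) != y:
            return False
    return True

COLUMNS = ("st_id", "name", "birth_date", "gender",
           "hostel_no", "room_no", "sem", "year")
DATA = [
    ("2009A7PS001", "arun", "01/12/1991", "m", "BH-1", 1, 1, "2011-12"),
    ("2009A7PS001", "arun", "01/12/1991", "m", "BH-1", 1, 2, "2011-12"),
    ("2009A7PS001", "arun", "01/12/1991", "m", "BH-3", 10, 1, "2010-11"),
    ("2009A7PS001", "arun", "01/12/1991", "m", "BH-4", 1, 2, "2010-11"),
    ("2009A7PS011", "arti", "11/10/1991", "f", "CH-4", 5, 1, "2010-11"),
    ("2009A7PS011", "arti", "11/10/1991", "f", "CH-4", 1, 2, "2010-11"),
    ("2009A7PS011", "arti", "11/10/1991", "f", "CH-5", 5, 1, "2011-12"),
]
student_t = [dict(zip(COLUMNS, r)) for r in DATA]

print(fd_holds(student_t, ["st_id"], ["name", "birth_date", "gender"]))  # True
print(fd_holds(student_t, ["st_id"], ["hostel_no"]))                     # False
```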
Example FDs
• Consider the following relational schema
course_t(course_no, title, units, ltp, sem,
year)
Example FDs
Course_no title units ltp sem year

CS C210 Operating Systems 3 300 1 2010-11


CS C210 Operating Systems 3 300 2 2010-11
CS C210 Operating Systems 3 300 1 2011-12
CS C352 Database Systems 3 300 2 2010-11
CS C352 Database Systems 3 300 1 2011-12
CS C352 Database Systems 3 300 2 2011-12
Example FDs
The following FD’s hold on ‘course_t’:
course_no → title
course_no → units
course_no → ltp
or
course_no → title units ltp

But course_no → sem and
course_no → year
do not hold on ‘course_t’
Example FDs
• Consider the following relational schema
st_marks_t(st_id, st_name, course_no,
course_title, section_no, test_no,
marks)
Example FDs
The following FD’s hold on ‘st_marks_t’:

st_id → st_name
course_no → course_title
st_id course_no → section_no, test_no, marks

But st_id → course_title and
course_no → st_name
do not hold on ‘st_marks_t’

Note: Is the above one correct? If not, what is the error?
Example FDs
The following is the correct one.
The following FDs hold on ‘st_marks_t’:
st_id → st_name
course_no → course_title
st_id course_no → section_no
st_id course_no test_no → marks
Example FDs
• Consider the following relational schema
Employee_t(emp_id, emp_name,
dept_id_w, dept_name_w, dept_id_h)
Example FDs
emp_id emp_name dept_id_w dept_name_w dept_id_h

245 Bharat cs comp. science cs


245 Bharat math mathematics cs
300 Karan cs comp. science
301 Hari cs comp. science math
301 Hari math mathematics math
311 Bharat math mathematics

emp_id → emp_name dept_id_h
dept_id_w → dept_name_w
First normal form (1NF): Atomic values and define primary keys
Projno. Projtitle Rollno fname Phone no Class mks
10 tourism RM2007022 Kunal 111 SE 25
10 tourism RM2007022 Kunal 222 SE 25
10 tourism RM2007025 Tirth 333 SE 25
10 tourism RF2007023 Pooja 444 SE 25
30 Hospital mgmt RM2006011 Yazad 555 TE 50
30 Hospital mgmt RM2006014 Sahil 666 TE 50
RF2005073 Pooja 777 BE 100
RM2007022 Kunal 111 SE

First normal form (1NF): Atomic values and define primary keys
Rollno Phone no

RM2007022 111

RM2007022 222

RM2007025 333

RF2007023 444

RM2006011 555

RM2006014 666

RF2005073 777

RM2007022 111
Second normal form (2NF): Functional dependency
Project table
Projno. Projtitle

10 tourism

30 Hospital mgmt

25 RTO
Second normal form (2NF): Functional dependency
Student table
Rollno fname Class

RM2007022 Kunal SE

RM2007025 Tirth SE

RF2007023 Pooja SE

RM2006011 Yazad TE

RM2006014 Sahil TE

RF2005073 Pooja BE

RM2007022 Kunal SE
Second normal form (2NF): Functional dependency
Student-Project table
Projno. Rollno mks

10 RM2007022 25

10 RM2007025 25

10 RF2007023 25

30 RM2006011 50

30 RM2006014 50
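
The 2NF step above amounts to projecting the flat table onto three column sets. The following is a minimal sketch in Python, assuming only the fully populated rows of the flat table; the helper name project and the variable names are hypothetical. It also checks that joining the three projections gives back the original rows.

```python
# Flat rows: (projno, projtitle, rollno, fname, class, mks)
flat = [
    (10, "tourism", "RM2007022", "Kunal", "SE", 25),
    (10, "tourism", "RM2007025", "Tirth", "SE", 25),
    (10, "tourism", "RF2007023", "Pooja", "SE", 25),
    (30, "Hospital mgmt", "RM2006011", "Yazad", "TE", 50),
    (30, "Hospital mgmt", "RM2006014", "Sahil", "TE", 50),
]

def project(rows, columns):
    """Relational projection: keep the given column positions, drop duplicates."""
    return sorted({tuple(row[c] for c in columns) for row in rows})

project_t = project(flat, (0, 1))             # Projno., Projtitle
student_t = project(flat, (2, 3, 4))          # Rollno, fname, Class
student_project_t = project(flat, (0, 2, 5))  # Projno., Rollno, mks

# Natural join of the three projections on projno and rollno.
rejoined = sorted(
    (pno, title, roll, fname, cls, mks)
    for pno, title in project_t
    for roll, fname, cls in student_t
    for pno2, roll2, mks in student_project_t
    if pno == pno2 and roll == roll2
)

print(project_t)                 # [(10, 'tourism'), (30, 'Hospital mgmt')]
print(rejoined == sorted(flat))  # True
```

The project title is now stored once per project instead of once per student.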
Third Normal form(3NF): Transitive Dependency

Rollno fname Class Marks

RM2007022 Kunal SE 25

RM2007025 Tirth SE 25

RF2007023 Pooja SE 25

RM2006011 Yazad TE 50

RM2006014 Sahil TE 50

RF2005073 Pooja BE 100


Third Normal form(3NF): Transitive Dependency

Rollno Class
RM2007022 SE
RM2007025 SE
RF2007023 SE
RM2006011 TE
RM2006014 TE
RF2005073 BE

Class Marks
SE 25
TE 50
BE 100
Requirements: 1NF
• Each table has a primary key: a minimal set of
attributes which can uniquely identify a record

• The values in each column of a table are atomic
(no multi-valued attributes allowed).

• All attributes are dependent on the primary key


Requirements: 2NF
A table is in second normal form if
• It is in first normal form
• It includes no partial dependencies (where a non-key
attribute is dependent on only a part of a
primary key)
Definition 3NF
A table is in third normal form if
• It is in second normal form
• It contains no transitive dependencies (where
a non-key attribute is dependent on another
non-key attribute)
• If A → B → C then decompose the table into two
relations having columns (A,B) and (B,C).
Q. Normalize the following schema into the best normal form
Order table (
• invoice no.,
• Date,
• Customer no.
• Customer name,
• Customer address,
• Customer city,
• Customer state,
• Item id
• Item description,
• Item quantity,
• Item price,
• Item total,
• Order total price)
Answer
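• Orders(invoiceno, customerno, orderdate, ordertotalprice)
• Customers(customerno, custname, address,
custcity, custstate)
• Items(itemid, itemdescr, itemprice)
• Orderlist(invoiceno, itemid, itemquantity)
Item quantity depends on the pair (invoiceno, itemid), so it belongs in Orderlist.
Item total is not stored; it can be derived as itemquantity × itemprice.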
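
One way to try out the answer is to create the four tables in an in-memory SQLite database from Python. This is only a sketch: the column types, the sample rows and the itemquantity column on orderlist (taken from the question's attribute list) are assumptions, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customerno INTEGER PRIMARY KEY,
    custname TEXT, address TEXT, custcity TEXT, custstate TEXT
);
CREATE TABLE orders (
    invoiceno INTEGER PRIMARY KEY,
    customerno INTEGER REFERENCES customers(customerno),
    orderdate TEXT,
    ordertotalprice REAL
);
CREATE TABLE items (
    itemid INTEGER PRIMARY KEY,
    itemdescr TEXT,
    itemprice REAL
);
CREATE TABLE orderlist (
    invoiceno INTEGER REFERENCES orders(invoiceno),
    itemid INTEGER REFERENCES items(itemid),
    itemquantity INTEGER,          -- assumed column, see lead-in
    PRIMARY KEY (invoiceno, itemid)
);
""")

# Hypothetical sample data, for illustration only.
conn.execute("INSERT INTO customers VALUES (1, 'Asha', '12 Main St', 'Pune', 'MH')")
conn.execute("INSERT INTO orders VALUES (100, 1, '2024-01-15', 70.0)")
conn.executemany("INSERT INTO items VALUES (?, ?, ?)",
                 [(7, "pen", 10.0), (8, "book", 50.0)])
conn.executemany("INSERT INTO orderlist VALUES (?, ?, ?)",
                 [(100, 7, 2), (100, 8, 1)])

# Item total is derived at query time instead of being stored.
rows = conn.execute("""
    SELECT i.itemdescr, ol.itemquantity * i.itemprice AS itemtotal
    FROM orderlist AS ol JOIN items AS i ON i.itemid = ol.itemid
    WHERE ol.invoiceno = 100
    ORDER BY i.itemid
""").fetchall()
print(rows)  # [('pen', 20.0), ('book', 50.0)]
```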
Attribute closure
• Suppose {A1,A2,…,An} is a set of attributes and S is a
set of FD’s. The closure of {A1,A2,…,An} under the
FD’s in S is the set of all attributes B such that every
relation that satisfies all the FD’s in set S also satisfies
A1A2…An → B, i.e., A1A2…An → B follows from the
FD’s of S. The closure of A1A2…An is denoted by
{A1,A2,…,An}+.
• Starting with the given set of attributes, we
repeatedly expand the set by adding the right sides
of FD’s as soon as we have included their left sides.
Eventually, we cannot expand the set any more and
the resulting set is the closure.
Attribute closure (cont.)
• The following steps explain the above in more detail:
step1. Let X be a set of attributes that eventually will
become the closure. First initialize X to {A1,A2,…,An}.
step2. Search for some FD B1B2…Bm → C such that all of
B1,B2,…,Bm are in the set of attributes X, but C is not.
Then add C to the set X.
step3. Repeat step 2 as many times as necessary until no
more attributes can be added to X. Since X can only grow
and the number of attributes of any relation schema is finite,
eventually nothing can be added to X.
step4. The set X, after no more attributes can be added to it, is the
correct value of {A1,A2,…,An}+.
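
The four steps can be sketched in Python as follows. The function name closure and the representation of an FD as a (left side, right side) pair of single-letter attribute strings are assumptions made for this sketch. The FD set is the one from the {A,B} exercise in these slides, and a right side with several attributes is added in one go rather than one attribute at a time.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    x = set(attrs)                       # step 1: start from the given attributes
    changed = True
    while changed:                       # step 3: repeat until nothing is added
        changed = False
        for lhs, rhs in fds:             # step 2: left side inside X, right side not
            if set(lhs) <= x and not set(rhs) <= x:
                x |= set(rhs)
                changed = True
    return x                             # step 4: X is the closure

FDS = [("AB", "C"), ("BC", "AD"), ("D", "E"), ("E", "F"), ("AH", "J")]
print(sorted(closure("AB", FDS)))  # ['A', 'B', 'C', 'D', 'E', 'F']
```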
Attribute closure
• Let F be a set of FD’s holding on a relation R. Let X, Y be
sets of attributes of R. Then Y is said to be the attribute closure
of X, denoted by X+ = Y, if Y is the set of all attributes A
such that X → A ‘follows’ from F
• Algorithm
Ex: Find attribute closure of {A,B} w.r.t the FD’s:
AB → C, BC → AD, D → E, E → F, AH → J
– Useful to check whether a given FD follows from a given set of FD’s:
example AB → D, D → A
– Useful to check whether a given set of attributes forms a key w.r.t
the FD’s
– Useful to find all FD’s that hold on a relation
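
The first two uses can be sketched in Python as below. The helper names follows and candidate_keys are hypothetical, and the schema R(A,B,C,D,E,F,H,J) is an assumption (the slide gives only the FD’s). The key search is brute force over all attribute subsets, so it is only suitable for small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= x and not set(rhs) <= x:
                x |= set(rhs)
                changed = True
    return x

def follows(lhs, rhs, fds):
    """lhs -> rhs follows from fds exactly when rhs is inside the closure of lhs."""
    return set(rhs) <= closure(lhs, fds)

def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            if any(set(k) <= set(combo) for k in keys):
                continue                 # contains a smaller key, so not minimal
            if closure(combo, fds) >= set(schema):
                keys.append(combo)
    return keys

FDS = [("AB", "C"), ("BC", "AD"), ("D", "E"), ("E", "F"), ("AH", "J")]
print(follows("AB", "D", FDS))   # True
print(follows("D", "A", FDS))    # False
print(candidate_keys("ABCDEFHJ", FDS))
# [('A', 'B', 'H'), ('B', 'C', 'H')]
```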
Closure Test
• An easier way to test is to compute the closure
of Y, denoted Y +.
• Basis: Y + = Y.
• Induction: Look for an FD’s left side X that is a
subset of the current Y +. If the FD is X -> A,
add A to Y +.
Attribute closure(example)
• The closure of {A,B} is {A,B}+.
• Let X = {A,B}
• Both left side attributes of the FD AB → C are in X, so add C to X. The left side of
BC → AD is then also in X, so add D (A is already present), i.e., X = {A,B,C,D}.
• In FD D → E, the left side attribute D is in X,
so add E to X, i.e., X = {A,B,C,D,E}.
• In FD E → F, the left side attribute E is in X, so
add F to X, i.e., X = {A,B,C,D,E,F}.
• The FD AH → J cannot be used, because the left side
attributes A H do not form a subset of X.
• Finally {A,B}+ = {A,B,C,D,E,F}.
Example schema
sid name serno subj cid exp-grade

1 Sam 570103 AI 520 B

23 Nitin 550103 DB 550 A

45 Jill 505103 OS 505 A

1 Sam 505103 OS 505 C


Rules of FD’s
• Assume W, X, Y, Z are sets of attributes of R.
– If Y ⊆ X then X → Y (reflexivity)
name, sid → name
– If X → Y then ZX → ZY (augmentation)
serno → subj so serno, exp-grade → subj, exp-grade
– If X → Y and Y → Z then X → Z (transitivity)
serno → cid and cid → subj
so serno → subj
– If X → Y and X → Z then X → YZ (union)
– If X → Y and Z ⊆ Y
then X → Z (decomposition)
– If W → X and XY → Z then WY → Z (pseudotransitivity)
– If XY → ZY then XY → Z
• The first three are known as Armstrong’s Axioms
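
The remaining rules are not independent: each can be derived from the three axioms. As a worked illustration (a sketch, not part of the original slides), the union and pseudotransitivity rules follow in a few steps:

```latex
\begin{align*}
\textbf{Union.}\quad & X \to Y,\; X \to Z && \text{given} \\
& X \to XY && \text{augment } X \to Y \text{ with } X \\
& XY \to YZ && \text{augment } X \to Z \text{ with } Y \\
& X \to YZ && \text{transitivity} \\[6pt]
\textbf{Pseudotransitivity.}\quad & W \to X,\; XY \to Z && \text{given} \\
& WY \to XY && \text{augment } W \to X \text{ with } Y \\
& WY \to Z && \text{transitivity}
\end{align*}
```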
Rules of FD’s (cont.)
• Reflexivity:
If {B1,B2,…,Bj} ⊆ {A1,A2,…,Ai} then
A1A2…Ai → B1B2…Bj. These are called trivial FD’s.
• Augmentation:
If A1A2…Ai → B1B2…Bj then
A1A2…AiC1C2…Ck → B1B2…BjC1C2…Ck
for any set of attributes C1,C2,…,Ck.
• Transitivity:
If A1A2…Ai → B1B2…Bj and B1B2…Bj → C1C2…Ck then
A1A2…Ai → C1C2…Ck.
Rules of FD’s (cont.)
• Union:
• If A1A2…Ai → B1B2…Bj and
A1A2…Ai → C1C2…Ck then
A1A2…Ai → B1B2…BjC1C2…Ck
• Decomposition:
• If A1A2…Ai → B1B2…BjC1C2…Ck then
A1A2…Ai → B1B2…Bj and
A1A2…Ai → C1C2…Ck
Rules of FD’s (cont.)
• If W → X and XY → Z then WY → Z
If D1D2…Dl → A1A2…Ai and
A1A2…Ai B1B2…Bj → C1C2…Ck then
D1D2…Dl B1B2…Bj → C1C2…Ck.
• If XY → ZY then XY → Z
If A1A2…Ai B1B2…Bj → C1C2…Ck B1B2…Bj
then A1A2…Ai B1B2…Bj → C1C2…Ck.
Closure of a Set of FD’s
Defn. Let F be a set of FD’s.
Its closure, F+, is the set of all FD’s:
{X → Y | X → Y is derivable from F by Armstrong’s Axioms}
Which of the following are in the closure of Student-Course FD’s?
name → name
cid → subj
serno → subj
cid, sid → subj
cid → sid
sid, serno → subj, exp-grade

Ans: Except cid → sid, all FD’s are in the closure of the Student-Course FD’s

Equivalence of FD sets
Defn. Two sets of FD’s, F and G, are equivalent if
their closures are equal, F+ = G+
e.g., these two sets are equivalent:
{XY → Z, X → Y} and
{X → Z, X → Y}

• F+ contains a huge number of FD’s
(exponential in the size of the schema)
• Would like to have the smallest “representative” FD
set
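
Because F+ is so large, equivalence is normally tested without building it: F and G are equivalent when every FD of G follows from F and every FD of F follows from G, and each such test is an attribute closure. A minimal Python sketch, with hypothetical helper names covers and equivalent and single-letter attributes:

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= x and not set(rhs) <= x:
                x |= set(rhs)
                changed = True
    return x

def covers(f, g):
    """True if every FD in g follows from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    return covers(f, g) and covers(g, f)

F = [("XY", "Z"), ("X", "Y")]
G = [("X", "Z"), ("X", "Y")]
print(equivalent(F, G))               # True
print(equivalent(F, [("X", "Y")]))    # False: XY -> Z is lost
```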

Minimal Cover (Canonical Cover)
Defn. A FD set F is minimal if:
1. Every FD in F is of the form X → A,
where A is a single attribute
(we express each FD in simplest form)
2. For no X → A in F is:
F – {X → A} equivalent to F
3. For no X → A in F and Z ⊂ X is:
F – {X → A} ∪ {Z → A} equivalent to F
(conditions 2 and 3: in a sense, each FD is “essential” to the cover)
Defn. F is a minimal cover for G if F is minimal and is equivalent to G.
e.g.,
{X → Z, X → Y} is a minimal cover for
{XY → Z, X → Z, X → Y}
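
One common way to compute a minimal cover follows the three conditions in order: split right sides, shrink left sides, then drop redundant FD’s. The Python sketch below is one such procedure under the assumption of single-letter attribute names; the function name minimal_cover is hypothetical. A minimal cover is not unique in general, so the result can depend on the order in which FD’s are examined.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= x and not set(rhs) <= x:
                x |= set(rhs)
                changed = True
    return x

def minimal_cover(fds):
    # Condition 1: a single attribute on every right side.
    cover = [(frozenset(lhs), a) for lhs, rhs in fds for a in rhs]
    # Condition 3: remove left-side attributes that are not needed.
    for i, (lhs, a) in enumerate(cover):
        for b in sorted(lhs):
            smaller = cover[i][0] - {b}
            if smaller and a in closure(smaller, cover):
                cover[i] = (smaller, a)
    cover = list(dict.fromkeys(cover))   # drop exact duplicates, keep order
    # Condition 2: remove FD's that follow from the remaining ones.
    for fd in list(cover):
        rest = [g for g in cover if g != fd]
        if fd[1] in closure(fd[0], rest):
            cover = rest
    return [("".join(sorted(lhs)), a) for lhs, a in cover]

print(minimal_cover([("XY", "Z"), ("X", "Z"), ("X", "Y")]))
# [('X', 'Z'), ('X', 'Y')]
```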

More on Closures
If F is a set of FD’s and X → Y ∉ F+
then for some attribute A ∈ Y, X → A ∉ F+

Proof by contradiction.
Assume otherwise and let Y = {A1,..., An}
Since we assume X → A1, ..., X → An are in F+
then X → A1 ... An is in F+ by union rule,
hence, X → Y is in F+ which is a contradiction

Why Armstrong’s Axioms?
Why are Armstrong’s axioms (or an equivalent rule
set) appropriate for FD’s? They are:
Consistent (sound): any relation satisfying FD’s in F will satisfy those in F+
Complete: if an FD X → Y cannot be derived by Armstrong’s axioms from
F, then there exists some relational instance satisfying F but not
X → Y

• In other words, Armstrong’s axioms derive all the
FD’s that should hold, and only those

Proving Consistency
We prove that the axioms’ definitions must be true
for any instance, e.g.:
• For augmentation (if X → Y then XW → YW):
If an instance r satisfies X → Y, then:
For any tuples t1, t2 ∈ r,
if t1[X] = t2[X] then t1[Y] = t2[Y] by defn.
If, additionally, it is given that t1[W] = t2[W],
then t1[YW] = t2[YW]

Proving Completeness
Suppose X → Y ∉ F+ and define a relational instance
r that satisfies F+ but not X → Y:
• Then for some attribute A ∈ Y, X → A ∉ F+
• Let some pair of tuples in r agree on X+ but disagree everywhere else:

X            | A    | X+ – X       | R – X+ – {A}
x1 x2 ... xn | a1,1 | v1 v2 ... vm | w1,1 w2,1 ...
x1 x2 ... xn | a1,2 | v1 v2 ... vm | w1,2 w2,2 ...

Proof of Completeness cont’d
• Clearly this relation fails to satisfy X → A and X → Y.
We also have to check that it satisfies any FD in F+.
• The tuples agree on only X+.
Thus the only FD’s that might be violated are of the form X’ → Y’ where X’ ⊆
X+ and Y’ contains attributes in
R – X+ – {A}.
• But if X’ → Y’ ∈ F+ and X’ ⊆ X+ then Y’ ⊆ X+ (reflexivity and augmentation).
Therefore X’ → Y’ is satisfied.

Normalize the following Relation

Student-course-professor details
Sid Sname cid Cname Grade Cloc Sec Pid pname
1 X 31 Java A C1 1 P4 John
1 X 45 Oracle B C2 1 P6 David
5 Z 45 Oracle A C2 1 P6 David
Student details
Sid Sname

1 X

5 Z
Course details

cid Cname Cloc

31 Java C1

45 Oracle C2
Professor details

Pid pname

P4 John

P6 David
Student-course details
Sid cid Grade

1 31 A

1 45 B

5 45 A
Course-professor details
cid Sec Pid

31 1 P4

45 1 P6
BCNF(Boyce Codd Normal Form)

A FD X → Y is said to be trivial if Y ⊆ X


• A
BCNF(Boyce Codd Normal Form)

i.e.,
If AB → C and C → B then
decompose the table into relations having
columns (A,C) and (C,B).
Begin with the non-key FD StudID → StudName. This
results in the decomposition
EnrollProf(StudID, ClassID, Grade, ProfID)
Stud(StudID, StudName)
Stud is in BCNF, but EnrollProf is not. The FD
ClassID → ProfID gives the further decomposition
Enroll(StudID, ClassID, Grade)
Prof(ClassID, ProfID)
Stud(StudID, StudName)
in which all schemas are in BCNF.

We could have begun with the FD ClassID → ProfID and first got
EnrollStud1(StudID, ClassID, Grade, StudName), Prof(ClassID, ProfID)
Then apply the FD StudID → StudName to EnrollStud1
and get the same BCNF tables as above
BCNF example
courseno, subjname → profno and profno → subjname
Q. Display the name of the professor who teaches OS.
Prob: P4 is displayed twice.
Q. List the subjects taken by prof P4.
Prob: OS is displayed twice.

Course no. Subj name Prof no.
FE Chem P1
FE Phy P5
SE OS P4
TE OS P4
SE DB P3
TE ADB P2
BCNF example
Course no. Prof no.
FE P1
FE P5
SE P4
SE P3
TE P4
TE P2
BCNF example
Prof no. Subj name
P1 Chem
P2 ADB
P3 DB
P4 OS
P5 Phy
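
The BCNF example can be checked with a short Python sketch. It uses the usual BCNF condition (every nontrivial FD must have a superkey on its left side), the two FD’s stated for this example, and the rows of the original table; the function and variable names are hypothetical. It reports which FD violates BCNF and confirms that joining the two decomposed tables on Prof no. reproduces the original rows.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) tuples."""
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= x and not set(rhs) <= x:
                x |= set(rhs)
                changed = True
    return x

def bcnf_violations(schema, fds):
    """Nontrivial FD's whose left side is not a superkey of the schema."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(schema)]

SCHEMA = ("courseno", "subjname", "profno")
FDS = [(("courseno", "subjname"), ("profno",)),
       (("profno",), ("subjname",))]
print(bcnf_violations(SCHEMA, FDS))  # [(('profno',), ('subjname',))]

original = {
    ("FE", "Chem", "P1"), ("FE", "Phy", "P5"), ("SE", "OS", "P4"),
    ("TE", "OS", "P4"), ("SE", "DB", "P3"), ("TE", "ADB", "P2"),
}
course_prof = {(c, p) for c, s, p in original}   # (Course no., Prof no.)
prof_subj = {(p, s) for c, s, p in original}     # (Prof no., Subj name)

# Natural join on Prof no.
rejoined = {(c, s, p) for c, p in course_prof for p2, s in prof_subj if p == p2}
print(rejoined == original)  # True
```

After the decomposition, P4 and OS each appear once in the (Prof no., Subj name) table, which removes the duplicate results noted in the two queries.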
Topics after Midsem
• Dependency Preservation
• Lossless Decomposition
• Fourth Normal Form (Multivalued Dependency)
• Fifth Normal Form (Projection-Join Normal Form)
