Relational Database Design

Key points about multivalued dependencies (MVDs) and fourth normal form (4NF): 1) An MVD X ->> Y specifies that each X value is associated with a set of Y values, not just a single Y value, independently of the remaining attributes. 2) The EMP relation has two MVDs: - ENAME ->> PNAME: each employee name is associated with a set of project names. - ENAME ->> DNAME: each employee name is associated with a set of dependent names. 3) To satisfy 4NF, every nontrivial MVD must have a superkey on its left-hand side. The EMP relation violates this, since ENAME is not a superkey in either of its two nontrivial MVDs.

Uploaded by

Tanisha Rathod
Relational Database Design

Relational database design: The grouping of attributes to form "good" relation schemas

Two levels of relation schemas:


The logical "user view" level
The storage "base relation" level

Criteria for "good" base relations:

•Discuss informal guidelines for good relational design
•Discuss formal concepts of functional dependencies
and normal forms 1NF, 2NF, 3NF, BCNF

There are two popular approaches to designing a database:
. Top-down design
. Bottom-up design
The ER modeling technique is the top-down approach.
It involves
i) Identifying entities and their attributes
ii) Identifying the relationships between entities
iii) Drawing the ER diagram
iv) Mapping the diagram to tables

Normalization is the bottom-up approach. It is the step-by-step decomposition of complex records into simple records.
Normalization controls redundancy and removes inconsistency and update anomalies.
Normalization is based on functional dependencies and the primary key.

Normalization: The process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations

Normal form: A condition, stated in terms of the keys and FDs of a relation, used to certify whether a relation schema is in a particular normal form
Informal design guidelines for relation schemas

1) Semantics of the relation attributes


2) Reducing the redundant values in tuples
3) Reducing the null values in tuples
4) Disallowing the possibility of generating spurious
tuples
Semantics of the Relation Attributes

GUIDELINE 1: Informally, each tuple in a relation should represent one entity or relationship instance.

⚫ Attributes of different entities (EMPLOYEEs, DEPARTMENTs, PROJECTs) should not be mixed in the same relation
⚫ Only foreign keys should be used to refer to other entities
⚫ Entity and relationship attributes should be kept apart as much as possible.
Redundant Information in Tuples and Update
Anomalies
GUIDELINE 2:

Mixing attributes of multiple entities may cause problems:
⚫ Information is stored redundantly, wasting storage
⚫ Update anomalies arise:
   Insertion anomalies
   Deletion anomalies
   Modification anomalies

Insert Anomaly: Cannot insert a project unless an employee is assigned to it.
Conversely, cannot insert an employee unless he/she is assigned to a project.

Delete Anomaly: When a project is deleted, all the employees who work on that project are deleted as well. Conversely, if an employee is the sole employee on a project, deleting that employee also deletes the corresponding project.
⚫ Update Anomaly: Changing the name of project number P1 from “Billing” to “Customer-Accounting” requires this update to be made in the tuples of all 100 employees working on project P1.

⚫ GUIDELINE 2: Design a schema that does not suffer from insertion, deletion, and update anomalies. If any are present, note them so that applications can be made to take them into account.

If a database design is flawed, it may contain anomalies, which make the database hard to keep consistent and hard to administer.

Update anomalies − If data items are scattered and are not linked to each other properly, strange situations can arise. For example, when we update a data item whose copies are scattered over several places, a few copies get updated properly while others are left with old values. This leaves the database in an inconsistent state.

Deletion anomalies − We try to delete a record, but parts of it are left undeleted because we are unaware that the data is also saved somewhere else.

Insert anomalies − We try to insert data into a record that does not exist at all.
Null Values in Tuples

GUIDELINE 3: Relations should be designed such that their tuples will have as few NULL values as possible

Reasons for nulls:
⚫ attribute not applicable or invalid
⚫ attribute value unknown (may exist)
⚫ value known to exist, but unavailable
Spurious Tuples

GUIDELINE 4: The relations should be designed to satisfy the lossless join condition. No spurious tuples should be generated by doing a natural join of any relations.

There are two important properties of decompositions:
(a) the non-additive (lossless) property of the corresponding join
(b) preservation of the functional dependencies.
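
The effect of a lossy decomposition can be seen on a tiny relation state. The sketch below is only an illustration: the rows, the helper names `project` and `natural_join`, and the choice of decompositions are assumptions, not taken from the slides. It projects an EMP_PROJ-style relation onto two schemas and joins the projections back.

```python
def project(rows, attrs):
    """Project dict-rows onto attrs; duplicates collapse because the result is a set."""
    return {frozenset((a, row[a]) for a in attrs) for row in rows}

def natural_join(left, right):
    """Natural join of two projected relations (sets of frozensets of (attr, value))."""
    joined = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            # Tuples combine only if they agree on every shared attribute.
            if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
                joined.add(frozenset({**ld, **rd}.items()))
    return joined

# Hypothetical relation state: two employees on two projects at the same location.
emp_proj = [
    {"eno": 1, "pno": "P1", "plocation": "Houston"},
    {"eno": 2, "pno": "P2", "plocation": "Houston"},
]
original = project(emp_proj, ["eno", "pno", "plocation"])

# Lossy: the only shared attribute is plocation, which is not a key of either part.
bad = natural_join(project(emp_proj, ["eno", "plocation"]),
                   project(emp_proj, ["pno", "plocation"]))
# Lossless here: the shared attribute pno determines plocation in this state.
good = natural_join(project(emp_proj, ["eno", "pno"]),
                    project(emp_proj, ["pno", "plocation"]))

spurious = bad - original
print(len(original), len(bad), len(spurious), good == original)
```

The tuples in `spurious` pair each employee with a project he or she does not work on, which is exactly what Guideline 4 is meant to rule out.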
Functional dependency
A functional dependency is a constraint between two sets of attributes
from the database.

An FD, denoted by X -> Y, between two sets of attributes X and Y that
are subsets of R specifies a constraint on the possible tuples that can
form a relation state r of R.
The constraint is that, for any two tuples t1 and t2 in r,
if t1[X] = t2[X], then t1[Y] = t2[Y]

R(X,Y)
      X    Y
t1    10   d1
t2    10   d1

There is an FD from X to Y, or Y is functionally dependent on X
FD=> Functional dependency or f.d
X=> L.H.S
Y=> R.H.S
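
As a rough sketch of the definition, the function below tests whether a given relation state violates X -> Y. The function name `fd_holds` and the row representation (a list of dicts) are assumptions made for illustration. Note that a state can only refute an FD; an FD is a property of the schema, so a passing check on one state does not prove it.

```python
def fd_holds(rows, lhs, rhs):
    """Return False if two rows agree on lhs but differ on rhs, else True."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y value seen for each X value and compare later rows.
        if seen.setdefault(x, y) != y:
            return False
    return True

# The two tuples t1 and t2 from the table above.
r = [{"X": 10, "Y": "d1"}, {"X": 10, "Y": "d1"}]
print(fd_holds(r, ["X"], ["Y"]))

# A hypothetical third tuple with the same X but a different Y breaks X -> Y.
print(fd_holds(r + [{"X": 10, "Y": "d2"}], ["X"], ["Y"]))
```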
Types of functional dependency:
Full functional dependency
Partial functional dependency
Transitive dependency

Full functional dependency
e.g., {Eno, Pno} -> Hours
Partial functional dependency
e.g., {Eno, Pno} -> Ename (Ename depends on Eno alone)
Transitive dependency
e.g., Eno -> Dno and Dno -> Dname, so Eno -> Dname
Given a set of FDs F, we can infer additional FDs that hold whenever the FDs in F hold

Armstrong's inference rules:
IR1. (Reflexive) If Y subset-of X, then X -> Y
IR2. (Augmentation) If X -> Y, then XZ -> YZ
(Notation: XZ stands for X U Z)
IR3. (Transitive) If X -> Y and Y -> Z, then X -> Z

Some additional inference rules that are useful:
IR4. (Decomposition) If X -> YZ, then X -> Y and X -> Z
IR5. (Union) If X -> Y and X -> Z, then X -> YZ
IR6. (Pseudotransitivity) If X -> Y and WY -> Z, then WX -> Z
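
In practice the rules are rarely applied one by one; the usual shortcut is the attribute closure X+, the set of all attributes determined by X. A minimal sketch follows, assuming FDs are given as (lhs, rhs) pairs of attribute collections; the function name `closure` is a choice made here, not something from the slides.

```python
def closure(attrs, fds):
    """Attribute closure: all attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole LHS is already derivable, the RHS is too.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Eno -> Dno and Dno -> Dname, the transitive example used in these slides.
fds = [({"Eno"}, {"Dno"}), ({"Dno"}, {"Dname"})]
print(sorted(closure({"Eno"}, fds)))
```

X -> Y follows from F exactly when Y is a subset of the closure of X, so here Eno -> Dname is inferred, as IR3 predicts.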
Trivial, Non-trivial

Trivial − If a functional dependency (FD) X → Y holds, where Y is a subset of X, then it is called a trivial FD. Trivial FDs always hold.

Non-trivial − If an FD X → Y holds, where Y is not a subset of X, then it is called a non-trivial FD.

Completely non-trivial − If an FD X → Y holds, where X ∩ Y = Φ, it is said to be a completely non-trivial FD.
Candidate key
If a relation schema has more than one key, each is called
a candidate key. One of the candidate keys is arbitrarily
designated to be the primary key, and the others are called
secondary keys.
Prime and Non prime attribute
A Prime attribute must be a member of some candidate
key
A Nonprime attribute is not a prime attribute—that is, it is
not a member of any candidate key.
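
Candidate keys and prime attributes can be found by brute force on small schemas. The sketch below assumes single-letter attribute names and a made-up schema R(A, B, C, D) with FDs AB -> C, C -> A, B -> D; none of this comes from the slides. It tries attribute sets in order of size and keeps the minimal ones whose closure is all of R.

```python
from itertools import combinations

def closure(attrs, fds):
    """All attributes determined by attrs under fds (pairs of lhs, rhs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure covers the whole schema."""
    attributes = set(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys

fds = [("AB", "C"), ("C", "A"), ("B", "D")]   # hypothetical FDs
keys = candidate_keys("ABCD", fds)
prime = set().union(*keys)        # members of some candidate key
nonprime = set("ABCD") - prime    # members of no candidate key
print(keys, sorted(prime), sorted(nonprime))
```

The search is exponential in the number of attributes, which is acceptable for classroom-sized schemas only.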
Normalization of data is a process of analyzing the given relation schemas based on their FDs and primary keys to achieve the desirable properties of
1) Minimizing redundancy
2) Minimizing the insertion, deletion, and modification anomalies

Normal forms
1NF, 2NF, 3NF, BCNF (Boyce-Codd Normal Form), 4NF, and 5NF
1NF is based on the primary key and atomic values: there must be no composite attributes, no multivalued attributes, and no relation within a relation.

Composite attribute

Before: Eno | Address | Ename (Fname, Lname)
After:  Eno | Address | Fname | Lname

Multivalued Attribute

Before: Dno | Dname | Dlocation (multivalued attribute)
After:  Dno | Dname   and   Dno | Dlocation

Relation within Relation

Before: Eno | Ename | Addr | Pno | Pname
After:  Eno | Ename   and   Eno | Pno | Pname



2NF - There is no partial dependency.

It is based on the concept of full functional dependency: every non-key attribute should be fully dependent on the key.

An FD X -> Y is a full functional dependency if removing any attribute from X means the dependency no longer holds.

Def: A relation schema R is in 2NF if every nonprime attribute A in R is fully functionally dependent on the primary key of R
E.g.
R = {eno, pno, hours, ename, pname, plocation}
Given functional dependencies

FD = { {eno, pno} -> hours,
       eno -> ename,
       pno -> {pname, plocation} }

R1 = {eno, pno, hours}
R2 = {eno, ename}
R3 = {pno, pname, plocation}
Now every nonprime attribute in R1, R2, and R3 is fully functionally dependent on the key of its relation, so all three relations are in 2NF.
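
The 2NF test on this example can be sketched in code. The version below is a simplification that assumes the relation has a single candidate key (the primary key), and the function name `partial_dependencies` is invented for illustration. It reports every nonprime attribute that is determined by a proper subset of the key.

```python
from itertools import combinations

def closure(attrs, fds):
    """All attributes determined by attrs under fds (pairs of lhs, rhs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def partial_dependencies(key, attributes, fds):
    """(key subset, attribute) pairs where a proper part of the key determines a nonprime attribute."""
    key = set(key)
    nonprime = set(attributes) - key   # valid only under the single-key assumption
    found = []
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            for attr in sorted(closure(subset, fds) & nonprime):
                found.append((subset, attr))
    return found

R = {"eno", "pno", "hours", "ename", "pname", "plocation"}
FD = [({"eno", "pno"}, {"hours"}),
      ({"eno"}, {"ename"}),
      ({"pno"}, {"pname", "plocation"})]

print(partial_dependencies({"eno", "pno"}, R, FD))                     # violations in R
print(partial_dependencies({"eno", "pno"}, {"eno", "pno", "hours"}, FD))  # none in R1
```

An empty result for R1 is consistent with the decomposition: hours depends on the whole key {eno, pno} and on no part of it.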
3NF
It is based on the concept of transitive dependency.
Def: A relation schema R is in 3NF if it satisfies 2NF and no nonprime attribute of R is transitively dependent on the primary key

Def (general): A relation schema R is in 3NF if, whenever a nontrivial FD X -> A holds in R, either
a) X is a superkey of R (or)
b) A is a prime attribute of R
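E.g.
R = {eno, ename, address, dno, dname}
Given functional dependencies
F = { eno -> {ename, address, dno},
      dno -> dname }
dname is transitively dependent on eno through dno, so R is not in 3NF. Decomposing gives:

R1 = {eno, ename, address, dno}   R2 = {dno, dname}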
BCNF (Boyce-Codd Normal Form)
Def:
A relation schema R is in BCNF if, whenever a nontrivial FD X -> A holds in R, X is a superkey of R
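
The 3NF and BCNF definitions differ only in the "A is a prime attribute" escape clause, which a short sketch makes visible. The function names and the second schema R(A, B, C) are assumptions for illustration, and the check looks only at the FDs listed rather than enumerating the full closure F+.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under fds (pairs of lhs, rhs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Listed nontrivial FDs whose LHS is not a superkey."""
    attributes = set(attributes)
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs)
            and not attributes <= closure(lhs, fds)]

def third_nf_violations(attributes, fds, prime):
    """BCNF violations that also determine at least one nonprime attribute."""
    return [(lhs, rhs) for lhs, rhs in bcnf_violations(attributes, fds)
            if set(rhs) - set(lhs) - set(prime)]

# The 3NF example from the slides: dno -> dname breaks both normal forms.
R = {"eno", "ename", "address", "dno", "dname"}
F = [(("eno",), ("ename", "address", "dno")), (("dno",), ("dname",))]
print(bcnf_violations(R, F), third_nf_violations(R, F, prime={"eno"}))

# Hypothetical R(A, B, C) with keys AB and AC: C -> B breaks BCNF but not 3NF.
G = [("AB", "C"), ("C", "B")]
print(bcnf_violations("ABC", G), third_nf_violations("ABC", G, prime="ABC"))
```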
Closure of a Set of Functional
Dependencies
Closure of a Set of Attributes
Redundancy of FDs
Equivalence of sets of FDs
⚫ Consider two sets of FDs, E and F
⚫ F is said to cover E if every FD in E is also in the closure of F (F+)
⚫ E and F are equivalent if E covers F and F covers E, that is, E+ = F+
Equivalence of FD
Equivalence of FDs: Two sets of functional dependencies F1 and F2, defined by database designers for the same problem domain, are considered equivalent if the closure of F1 equals the closure of F2.
Ex. Consider the two sets of FDs F and G below
F={B->CDE,AD->E,B->A} G={B->CD,B->ABC,AD->E}
Solution:
➢ Take the left-hand sides of G (B and AD) and compute their closures using F.
B+ : B ⇒ BCDE ⇒ ABCDE
AD+ : AD ⇒ ADE
Every FD in G follows from these closures, so F covers G.
➢ Take the left-hand sides of F (B and AD) and compute their closures using G.
B+ : B ⇒ BCD ⇒ ABCD ⇒ ABCDE
AD+ : AD ⇒ ADE
Every FD in F follows from these closures, so G covers F.
Hence F+ = G+
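
The same check can be automated with attribute closures. This is a minimal sketch, assuming single-letter attribute names so that an FD can be written as a pair of strings; the names `covers` and `equivalent` are chosen here for illustration.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under fds (pairs of lhs, rhs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def covers(f, g):
    """True if every FD in g can be inferred from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    """Two FD sets are equivalent when each covers the other."""
    return covers(f, g) and covers(g, f)

F = [("B", "CDE"), ("AD", "E"), ("B", "A")]
G = [("B", "CD"), ("B", "ABC"), ("AD", "E")]
print(sorted(closure("B", F)), sorted(closure("B", G)), equivalent(F, G))
```

Computing closures of the left-hand sides avoids materializing F+ and G+, which can be exponentially large.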
Canonical Cover
Example of Computing a
Canonical Cover
Finding Keys
3. Multivalued Dependencies and Fourth Normal Form
(a) The EMP relation with two MVDs: ENAME ->> PNAME and ENAME ->> DNAME.

(b) Decomposing the EMP relation into two 4NF relations EMP_PROJECTS and EMP_DEPENDENTS.
Multivalued Dependencies and Fourth Normal Form
Definition:
⚫ A multivalued dependency (MVD) X ->> Y specified on
relation schema R, where X and Y are both subsets of R,
specifies the following constraint on any relation state r of R: If
two tuples t1 and t2 exist in r such that t1[X] = t2[X], then two
tuples t3 and t4 should also exist in r with the following
properties, where we use Z to denote (R − (X ∪ Y)):
⚫ t3[X] = t4[X] = t1[X] = t2[X].
t3[Y] = t1[Y] and t4[Y] = t2[Y].
t3[Z] = t2[Z] and t4[Z] = t1[Z].
⚫ An MVD X ->> Y in R is called a trivial MVD if
(a) Y is a subset of X, or (b) X ∪ Y = R.
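
The t1/t2/t3/t4 condition translates almost directly into a check on a relation state. In the sketch below the function name `mvd_holds` and the sample EMP rows are illustrative assumptions (the figure data is not reproduced in the text). Only t3 is tested explicitly, because t4 is the t3 of the pair taken in the opposite order.

```python
def mvd_holds(rows, x, y, all_attrs):
    """Check X ->> Y on one relation state by searching for the required tuple t3."""
    present = {tuple(r[a] for a in all_attrs) for r in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[a] == t2[a] for a in x):
                # t3 takes X and Y values from t1 and the Z values from t2.
                t3 = tuple(t1[a] if (a in x or a in y) else t2[a]
                           for a in all_attrs)
                if t3 not in present:
                    return False
    return True

attrs = ["ENAME", "PNAME", "DNAME"]
emp = [  # hypothetical state: every project paired with every dependent
    {"ENAME": "Smith", "PNAME": "X", "DNAME": "John"},
    {"ENAME": "Smith", "PNAME": "Y", "DNAME": "Anna"},
    {"ENAME": "Smith", "PNAME": "X", "DNAME": "Anna"},
    {"ENAME": "Smith", "PNAME": "Y", "DNAME": "John"},
]
print(mvd_holds(emp, ["ENAME"], ["PNAME"], attrs))      # all four combinations present
print(mvd_holds(emp[:3], ["ENAME"], ["PNAME"], attrs))  # (Smith, Y, John) is missing
```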
Multivalued Dependencies and Fourth Normal Form
Definition:
⚫ A relation schema R is in 4NF with respect
to a set of dependencies F (that includes
functional dependencies and multivalued
dependencies) if, for every nontrivial
multivalued dependency X ->> Y in F+, X
is a superkey for R.
Multivalued Dependencies and Fourth Normal Form
Decomposing a relation state of EMP that is not in 4NF.
(a) EMP relation with additional tuples.
(b) Two corresponding 4NF relations EMP_PROJECTS and EMP_DEPENDENTS.
4. Join Dependencies and Fifth Normal Form (1)
Definition:
⚫ A join dependency (JD), denoted by JD(R1, R2, ..., Rn),
specified on relation schema R, specifies a constraint on the
states r of R. The constraint states that every legal state r of R
should have a non-additive join decomposition into R1, R2, ...,
Rn; that is, for every such r we have
* (ΠR1(r), ΠR2(r), ..., ΠRn(r)) = r

⚫ A join dependency JD(R1, R2, ..., Rn), specified on relation
schema R, is a trivial JD if one of the relation schemas Ri in
JD(R1, R2, ..., Rn) is equal to R.
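
On a single relation state, the JD condition can be tested by projecting onto each Ri and joining the projections back. The sketch below is an assumption-laden illustration: the attribute names S, P, J and the sample rows are invented, and passing the test on one state does not establish the JD for the schema.

```python
from functools import reduce

def project(rows, attrs):
    """Project dict-rows onto attrs as a set of frozensets of (attr, value)."""
    return {frozenset((a, row[a]) for a in attrs) for row in rows}

def natural_join(left, right):
    """Natural join of two projected relations."""
    joined = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
                joined.add(frozenset({**ld, **rd}.items()))
    return joined

def jd_holds(rows, all_attrs, components):
    """True if joining the projections onto each component gives back the state."""
    original = project(rows, all_attrs)
    joined = reduce(natural_join, (project(rows, c) for c in components))
    return joined == original

attrs = ["S", "P", "J"]   # hypothetical supplier, part, project attributes
parts = [["S", "P"], ["S", "J"], ["P", "J"]]
supply = [
    {"S": "s1", "P": "p1", "J": "j2"},
    {"S": "s1", "P": "p2", "J": "j1"},
    {"S": "s2", "P": "p1", "J": "j1"},
    {"S": "s1", "P": "p1", "J": "j1"},
]
print(jd_holds(supply, attrs, parts))       # the three-way join returns the state
print(jd_holds(supply[:3], attrs, parts))   # the join adds (s1, p1, j1)
```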
Join Dependencies and Fifth Normal Form (2)
Definition:
⚫ A relation schema R is in fifth normal form (5NF) (or
Project-Join Normal Form (PJNF)) with respect to a
set F of functional, multivalued, and join dependencies
if, for every nontrivial join dependency JD(R1, R2, ...,
Rn) in F+ (that is, implied by F), every Ri is a superkey
of R.
Relation SUPPLY with Join Dependency and conversion to Fifth
Normal Form
(c) The relation SUPPLY with no MVDs is in 4NF but not in 5NF if it has the JD(R1, R2, R3).
(d) Decomposing the relation SUPPLY into the 5NF relations R1, R2, and R3.
Steps to find Minimal Cover

• Make every RHS a singleton attribute
• Identify extraneous attributes and remove them
• Remove redundant dependencies

Singleton attributes in RHS

AB->CD
The above functional dependency should be decomposed so that each RHS is a single attribute, as below.
AB-> C and
AB-> D
Identify extraneous attributes and remove them

⚫ An attribute on the LHS of an FD is extraneous if the RHS can still be derived from the remaining LHS attributes; such an attribute is removed.
⚫ If the LHS has more than one attribute, check whether it contains an extraneous (extra/unwanted) attribute; if so, remove it.

Consider the functional dependencies
⚫ A-> B
⚫ AB-> C
⚫ D-> A
⚫ D-> C
⚫ D-> E

The only FD whose LHS has 2 attributes is AB-> C
A+ = ABC (A-> B gives B, then AB-> C gives C)
B+ = B [Reflexivity]

C is in A+, so A alone determines C and B is extraneous. C is not in B+, so A is not extraneous.

B is extraneous in AB-> C, which implies A-> C

So, the FDs are A->B, A-> C, D->A, D->C, D-> E
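
The closure test for an extraneous LHS attribute can be sketched as follows, assuming single-letter attribute names and FDs written as pairs of strings. The helper name `extraneous_lhs_attributes` is hypothetical.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under fds (pairs of lhs, rhs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def extraneous_lhs_attributes(lhs, rhs, fds):
    """LHS attributes that can be dropped because the rest of the LHS still derives rhs."""
    lhs = set(lhs)
    return [a for a in sorted(lhs)
            if len(lhs) > 1 and set(rhs) <= closure(lhs - {a}, fds)]

F = [("A", "B"), ("AB", "C"), ("D", "A"), ("D", "C"), ("D", "E")]
print(sorted(closure("A", F)), sorted(closure("B", F)))
print(extraneous_lhs_attributes("AB", "C", F))
```

Each attribute is tested independently here; when an FD has several extraneous candidates, they should be removed one at a time and the test repeated.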
Finding Redundant Dependency
Consider the functional dependencies
Step 1: Apply singleton to RHS
⚫ A-> B
⚫ A-> C
⚫ D-> A
⚫ D-> C
⚫ D-> E
Step 2: In the LHS there is no extraneous attribute
Step 3: Remove redundant dependencies
1. Remove A-> B and find the attribute closure for A
A+ = AC [without A-> B, B can't be found in A+, so A-> B is not a redundant dependency]
2. Remove A-> C and find the attribute closure for A
A+ = AB [without A-> C, C can't be found in A+, so A-> C is not a redundant dependency]
3. Remove D-> A and find the attribute closure for D
D+ = DCE [without D-> A, A can't be found in D+, so D-> A is not a redundant dependency]
4. Remove D-> C and find the attribute closure for D
D+ = DABCE [without D-> C, C can still be found in D+ through D-> A and A-> C, so D-> C is a redundant dependency and should be removed]
Then the FDs are A-> B, A-> C, D-> A, D-> E
5. Remove D-> E and find the attribute closure for D
D+ = DABC [without D-> E, E can't be found in D+, so D-> E is not a redundant dependency]
⚫ So, the minimal cover is what remains after removing
⚫ a) Extraneous attributes
⚫ b) Redundant dependencies

⚫ The minimal functional dependencies are
⚫ A-> B
⚫ A-> C
⚫ D-> A
⚫ D-> E
Find a Minimal Cover
⚫ R(A B C D E)
⚫ F = { A->D,
⚫ BC->AD,
⚫ C->B,
⚫ E->A,
⚫ E->D }
Steps:
⚫ Singleton attributes in RHS
⚫ Identify extraneous attributes and remove them
⚫ Remove redundant dependencies
Singleton attributes in RHS
R(A B C D E)

Before:              After:
F = { A->D,          F = { A->D,
      BC->AD,              BC->A,
      C->B,                BC->D,
      E->A,                C->B,
      E->D }               E->A,
                           E->D }
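Identify extraneous attributes and remove them

C+ = ABCD (C->B gives B, then BC->A gives A and BC->D gives D), so B is extraneous in both BC->A and BC->D.

Before:              After:
F = { A->D,          F = { A->D,
      BC->A,               C->A,
      BC->D,               C->D,
      C->B,                C->B,
      E->A,                E->A,
      E->D }               E->D }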
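Remove redundant FDs

C->D is redundant (it follows from C->A and A->D), and E->D is redundant (it follows from E->A and A->D).

Before:              After:
F = { A->D,          F = { A->D,
      C->A,                C->A,
      C->D,                C->B,
      C->B,                E->A }
      E->A,
      E->D }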
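
The three steps can be combined into one routine. The following is a minimal sketch, assuming single-letter attribute names and FDs written as pairs of strings; the function name `minimal_cover` is a choice made here. A minimal cover is not unique in general, so a different processing order may produce a different but equivalent result.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under fds (pairs of lhs, rhs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def minimal_cover(fds):
    # Step 1: split every RHS into singleton attributes.
    work = [(set(lhs), attr) for lhs, rhs in fds for attr in rhs]
    # Step 2: drop extraneous LHS attributes, one attribute at a time.
    for i, (lhs, attr) in enumerate(work):
        for a in sorted(lhs):
            smaller = lhs - {a}
            if smaller and attr in closure(smaller, work):
                lhs = smaller
                work[i] = (lhs, attr)
    # Discard exact duplicates that step 2 may have produced.
    unique = []
    for fd in work:
        if fd not in unique:
            unique.append(fd)
    # Step 3: drop any FD that still follows from the remaining ones.
    result = list(unique)
    for fd in unique:
        rest = [g for g in result if g != fd]
        if fd[1] in closure(fd[0], rest):
            result = rest
    return ["".join(sorted(lhs)) + "->" + attr for lhs, attr in result]

print(minimal_cover([("A", "D"), ("BC", "AD"), ("C", "B"), ("E", "A"), ("E", "D")]))
print(minimal_cover([("A", "B"), ("AB", "C"), ("D", "A"), ("D", "C"), ("D", "E")]))
```

The two calls correspond to the two worked examples in these slides.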
