
Week8 DBMS

The document covers database design theory, focusing on normalization, functional dependencies, and the process of finding keys and minimal covers for relation schemas. It discusses the importance of good schema design at both logical and implementation levels, and outlines informal guidelines for reducing redundancy and avoiding anomalies in database relations. Additionally, it explains the steps to determine functional dependencies and identify candidate keys through attribute closure.

Uploaded by

SUVODIP JANA
Copyright
© © All Rights Reserved
Week 8: Database Design Theory

⚫ Normalization and Informal Guidelines
⚫ Functional Dependency
⚫ Finding Key
⚫ Attribute Closure
⚫ Minimal Cover

Normalization:
Good relation schema design
The goodness of a relation schema can be judged at two levels:
⚫ Logical level
⚫ Implementation level
The first is the logical (or conceptual) level—how users interpret the relation schemas
and the meaning of their attributes. Having good relation schemas at this level enables users
to understand clearly the meaning of the data in the relations, and hence to formulate their
queries correctly. The second is the implementation (or physical storage) level—how the
tuples in a base relation are stored and updated. This level applies only to schemas of base
relations—which will be physically stored as files—whereas at the logical level we are
interested in schemas of both base relations and views (virtual relations).

Database design may be performed using two approaches: bottom-up or top-down. A bottom-up
design methodology considers the basic relationships among individual attributes
as the starting point and uses those to construct relation schemas. This approach is not very
popular in practice because it suffers from the problem of having to collect a large number of
binary relationships among attributes as the starting point. For practical situations, it is next to
impossible to capture binary relationships among all such pairs of attributes. In contrast, a
top-down design methodology (also called design by analysis) starts with a number of
groupings of attributes into relations that exist together naturally, for example, on an invoice,
a form, or a report. The relations are then analysed individually and collectively, leading to
further decomposition until all desirable properties are met. The database design theory
discussed this week is applicable to both the top-down and bottom-up design approaches
but is more appropriate when used with the top-down approach.

Normalization can be defined as:
⚫ Taking relations through normal forms
⚫ Making a good schema design
⚫ Refining the database
Informal Design Guidelines:
To start with normalization, we look at four informal guidelines that may be used as a measure
of the quality of a relation schema design:
■ Making sure that the semantics of the attributes is clear in the schema
■ Reducing the redundant information in tuples
■ Reducing the NULL values in tuples
■ Disallowing the possibility of generating spurious tuples

Making sure that the semantics of the attributes is clear in the schema:
The two example schemas below each combine two entity types: Employee with Department, and
Employee with Project. In an ER diagram each of these would be a separate entity, and mapping
the ER diagram to a relational schema would give separate relations.
(a) Ex. Emp_dept (ssn, ename, add, dno, dname, mgrssn)
(b) Ex. Emp_proj (ssn, pno, hours, name, pname, plocation)

Guideline 1:
Design each relation schema so that it does not combine attributes from multiple entity types
and relationship types.

Reducing the redundant information in tuples:
Redundant information is stored when the attributes of two entities are mixed in one relation,
and update anomalies occur:
i) Insertion – both employee and department values have to be inserted together
ii) Deletion – if the last employee of a department is deleted, the department information is lost
iii) Modification – if mgrssn is modified in the 3rd row, the other rows of the same department
must also be modified
Ex. Table - Emp_dept

ssn   Name     age   dno   dname   mgrssn
101   Kumar    25    1     Site    Null
102   Ram      28    1     Site    101
103   John     23    2     Scse    104
104   Siva     29    2     Scse    Null
105   Nikhil   30    3     Smbs    104

Guideline 2: Design the base relation schemas so that no insertion, deletion, or modification
anomalies are present in the relations. If any anomalies are present, note them clearly and
make sure that the programs that update the database will operate correctly.
⚫ Tables are put together to reduce storage space.
⚫ Create views for base relations with joins for easy querying.

Reducing the NULL values in tuples:

Null values
⚫ Waste storage space
⚫ Complicate join operations at the logical level
Guideline 3:
Avoid placing attributes in a base relation whose values may frequently be null.

Disallowing the possibility of generating spurious tuples:


Consider the relations emp_loc and emp_proj (employee location and employee project). Their
only common attribute is ploc (project location).

When emp_proj and emp_loc are joined on ploc, the result pairs SSN 11 with the name Ram and
project PA, and also pairs SSN 13 with the name Ram and project PC. The tables hold only three
SSNs and three distinct names, so a different SSN should carry a different name. The tuple
(13, Ram, ...) is therefore wrong. It is called a spurious tuple: a tuple that is not in the
database but is generated by a wrong join.
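
The effect can be reproduced with a minimal Python sketch. The rows below are invented for illustration (only the relation names and the attribute ploc come from the text), and the relations are cut down to a few columns. It assumes emp_proj holds (ssn, pno, ploc) and emp_loc holds (ename, ploc).

```python
# Hypothetical rows, invented for illustration only.
# Each row is (ssn, ename, pno, ploc) in the original, undivided relation.
original = [
    (11, "Ram", "PA", "Chennai"),
    (12, "Siva", "PB", "Vellore"),
    (13, "John", "PC", "Chennai"),
]

# Split into two relations that share only ploc,
# which is neither a primary key nor a foreign key.
emp_proj = sorted({(ssn, pno, ploc) for ssn, _, pno, ploc in original})
emp_loc = sorted({(ename, ploc) for _, ename, _, ploc in original})


def join_on_ploc(proj_rows, loc_rows):
    """Equi-join the two relations on equal ploc values."""
    return [
        (ssn, ename, pno, ploc)
        for ssn, pno, ploc in proj_rows
        for ename, loc in loc_rows
        if ploc == loc
    ]


joined = join_on_ploc(emp_proj, emp_loc)
# Tuples produced by the join that were never in the original data.
spurious = [row for row in joined if row not in original]
print(spurious)
```

With this data the join returns every original row plus two extra ones, including (13, "Ram", "PC", "Chennai"), because two employees share the same project location.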

Guideline 4:
Design relation schemas so that they can be joined with equality conditions on attributes that
are either PRIMARY KEYS or FOREIGN KEYS.
1. Do not have relations whose common attributes are neither PK nor FK.

2. If such relations are unavoidable, do not join them on those attributes, because the join
produces spurious tuples.

Functional Dependencies:
A functional dependency is a constraint between two sets of attributes from the database.
It is a property of the semantics of the attributes, and database designers use functional
dependencies to specify those semantics.

If we have an FD X -> Y (X determines Y), then for any two tuples t1 and t2,
t1[X] = t2[X] => t1[Y] = t2[Y]
where X and Y are sets of attributes.
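
The tuple condition can be checked on a relation instance with a short Python sketch. The helper name fd_holds and the list-of-dicts representation are assumptions made for this illustration. The rows are taken from the Emp_dept table shown earlier. Note that an instance can only show that an FD is violated. Whether an FD truly holds depends on the semantics of the attributes.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by this relation instance.

    rows: list of dicts, one per tuple; lhs, rhs: lists of attribute names.
    """
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Two tuples that agree on X must also agree on Y.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Rows of the Emp_dept example table (Null is represented as None).
emp_dept = [
    {"ssn": 101, "name": "Kumar", "dno": 1, "dname": "Site", "mgrssn": None},
    {"ssn": 102, "name": "Ram", "dno": 1, "dname": "Site", "mgrssn": 101},
    {"ssn": 103, "name": "John", "dno": 2, "dname": "Scse", "mgrssn": 104},
    {"ssn": 104, "name": "Siva", "dno": 2, "dname": "Scse", "mgrssn": None},
    {"ssn": 105, "name": "Nikhil", "dno": 3, "dname": "Smbs", "mgrssn": 104},
]

print(fd_holds(emp_dept, ["dno"], ["dname"]))   # True
print(fd_holds(emp_dept, ["dno"], ["mgrssn"]))  # False: dno 1 has two values
```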

Inference Rules for Functional Dependencies:

William W. Armstrong developed a set of inference rules, called Armstrong's axioms, that are
used to infer all the functional dependencies on a relational database. The axioms are
reflexivity, augmentation, and transitivity. Union and decomposition can be derived from them.

Reflexivity: if Y is a subset of X, then X determines Y
if Y ⊆ X, X -> Y
Augmentation: if X determines Y, then XZ determines YZ for any Z
if X -> Y, XZ -> YZ
Transitivity: if X determines Y and Y determines Z, then X must also determine Z
if X -> Y & Y -> Z, X -> Z
Union: if X determines Y and X determines Z, then X must also determine Y and Z together
if X -> Y & X -> Z, X -> YZ
Decomposition: if X determines Y and Z, then X determines Y and X determines Z separately
if X -> YZ, X -> Y & X -> Z

All or a few of these rules can be applied to a set of dependencies to find the minimal
dependencies.
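
As a worked illustration, the union rule does not need to be assumed separately. It follows from augmentation and transitivity. The steps below are a standard derivation, written out as a sketch:

```latex
\begin{align*}
1.\;& X \to Y   && \text{(given)} \\
2.\;& X \to Z   && \text{(given)} \\
3.\;& X \to XY  && \text{(augment 1 with } X \text{, using } XX = X\text{)} \\
4.\;& XY \to YZ && \text{(augment 2 with } Y\text{)} \\
5.\;& X \to YZ  && \text{(transitivity on 3 and 4)}
\end{align*}
```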

Key of a Relation:

Having seen what normalization and functional dependencies are, let us see how to normalize
a relation with a given set of FDs. Before we look at normal forms, we have to find the key of
the relation from the given FDs.

Finding the Key of the Relation:

1. Let S be the set of FDs
2. For every FD Si in S, take the LHS attribute and find its closure
3. If an attribute determines all the attributes in the relation, it is a candidate key and can
be chosen as the PK
4. If no single LHS attribute determines all the attributes, check combinations of LHS
attributes

Finding closure of an attribute:

Algorithm to compute X+, the closure of X under the FD set F
X+ := X;
while (changes to X+) do
    for each V → W in F do
    Begin
        if V ⊆ X+ then
            X+ := X+ ∪ W
    End

Start with X+ containing X itself, where X is the LHS of a given FD. Add the attributes
determined by X, then keep adding the attributes determined by the attributes already in X+
until nothing more can be added.
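
The closure algorithm translates almost line for line into Python. This is a minimal sketch, assuming each attribute is a single letter and each FD is written as a (lhs, rhs) pair of strings. The FD set used in the demonstration is hypothetical.

```python
def closure(attrs, fds):
    """Compute the closure of attrs under the FD list fds.

    attrs: iterable of attribute names (e.g. the string "AB").
    fds: list of (lhs, rhs) pairs, each an iterable of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD once its whole left side is inside the closure.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# Hypothetical FD set over R(A, B, C, D): A -> B, B -> C.
fds = [("A", "B"), ("B", "C")]
print(sorted(closure("A", fds)))  # ['A', 'B', 'C']
print(sorted(closure("D", fds)))  # ['D']
```

The loop stops as soon as a full pass over the FDs adds nothing new, which matches the "while (changes to X+)" condition.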

Example: Finding the key of the Relation
Let us take the relation Stud_dept (student and department) and a set of FDs
Stud_dept (Reg, name, prog, dcode, dname, dean)

FD = { Reg -> name ; Reg -> prog ; Reg -> dcode ; dcode -> dname, dean }
Only Reg and dcode appear on the LHS of the FDs. Find the closure of Reg and of dcode.

{Reg}+ = Reg [ first add X to X+ ]
{Reg}+ = Reg, name [ add Y determined by X ]
{Reg}+ = Reg, name, prog [ add other Y determined by X ]
{Reg}+ = Reg, name, prog, dcode [ add other Y determined by X ]
{Reg}+ = Reg, name, prog, dcode, dname, dean [ add the attributes determined by dcode ]

{dcode}+ = dcode [ first add X ]
{dcode}+ = dcode, dname, dean [ add Y determined by X ]

All attributes are determined by Reg, so it is the key of the relation.
lossless join property.

Minimal Cover
Steps to find the Minimal Cover
⚫ Make every Right Hand Side [RHS] a singleton attribute
⚫ Identify extraneous attributes and remove them
⚫ Remove redundant dependencies

Step 1: Singleton in RHS
⚫ AB -> CD
⚫ The above functional dependency should be decomposed so that each RHS is a singleton
attribute, as below.
⚫ AB -> C and
⚫ AB -> D
Attribute Closure
Example 1: Attribute Closure for one key attribute
Consider the Functional Dependencies for R(A, B, C)
A -> B
B -> C
A+: Step 1: AB [Since A -> B]
    Step 2: ABC [Since B -> C]
B+: Step 1: BC [Since B -> C]
So,
A+ = ABC
B+ = BC
⚫ Note: If the closure of an attribute gives all the attributes in the given relation, that
attribute is a Candidate Key.
⚫ From the given set of functional dependencies, A is a Candidate Key.
Example 2: Attribute Closure for more than one key attribute
Consider the Functional Dependencies for R(A, B, C, D, E, F)
AB -> C
AD -> E
B -> D
AF -> B
B -> C
AB+ = ABCDE [Since AB->C, B->D, AD->E]
AD+ = ADE [Since AD->E]
B+ = BCD [Since B->C, B->D]
AF+ = AFBCDE [Since AF->B, B->C, B->D, AB->C, AD->E]

⚫ From the given set of functional dependencies, AF is a Candidate Key.

Example 3: No single key attribute is a candidate key
Consider the Functional Dependencies for R(A, B, C,D,E)
A -> C
C-> B
D -> E
A+ = ACB [Since A->C, C->B]
C+ = CB [Since C->B]
D+ = DE [Since D->E]

Since none of the single key attribute closures gives all the attributes, try the closure of a
combination of key attributes, which may give all the attributes in the relation.
AD+ = ADCEB [Since A->C, D->E, C->B]
⚫ From the given set of functional dependencies, AD is a Candidate Key.
Example 4: No single key attribute is a candidate key
Consider the Functional Dependencies for R(A, B, C,D,E)
A -> B
C-> D
BC -> E
A+ = AB [Since A->B]
C+ = CD [Since C->D]
BC+ = BCED [Since BC->E, C->D]
Since none of the single key attribute closures gives all the attributes, try the closure of a
combination of key attributes, which may give all the attributes in the relation.
AC+ = ACBDE [Since A->B, C->D, BC->E]
⚫ From the given set of functional dependencies, AC is a Candidate Key.
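
The search for a key can be sketched in Python as a brute-force loop over attribute combinations, smallest first. This is an illustrative variant, not the exact procedure above: it tries every attribute combination rather than only LHS attributes. It also assumes single-letter attributes and FDs written as (lhs, rhs) string pairs. The function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(relation, fds):
    """Return every minimal attribute set whose closure is the whole relation."""
    all_attrs = set(relation)
    keys = []
    # Smaller combinations come first, so any superset of a key is skipped.
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == all_attrs:
                keys.append(candidate)
    return keys


def show(keys):
    """Format keys as sorted strings such as 'AD'."""
    return ["".join(sorted(k)) for k in keys]


# Examples 2, 3 and 4 from the text.
ex2 = [("AB", "C"), ("AD", "E"), ("B", "D"), ("AF", "B"), ("B", "C")]
ex3 = [("A", "C"), ("C", "B"), ("D", "E")]
ex4 = [("A", "B"), ("C", "D"), ("BC", "E")]
print(show(candidate_keys("ABCDEF", ex2)))  # ['AF']
print(show(candidate_keys("ABCDE", ex3)))   # ['AD']
print(show(candidate_keys("ABCDE", ex4)))   # ['AC']
```

The exhaustive search is exponential in the number of attributes, which is acceptable for classroom-sized schemas only.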
Step 2: Removing Extraneous Attributes
⚫ If an attribute on the LHS adds nothing to the functional dependency, we say it is
extraneous and remove it
⚫ Consider the functional dependencies
A -> B
AB -> C
D -> AC
D -> E
After applying Step 1 (singleton RHS):
A -> B
AB -> C
D -> A
D -> C
D -> E
If an LHS has more than one attribute, check whether it contains an extraneous
(extra/unwanted) attribute. If so, remove it.

The only FD with 2 attributes on the LHS is AB -> C
A+ = ABC [Since A->B, AB->C]
B+ = B [Reflexivity]
An attribute on the LHS is extraneous if the closure of the remaining LHS attributes already
contains the RHS. Here A+ contains C, so B is extraneous. B+ does not contain C, so A is not
extraneous.
So, B is extraneous in AB -> C, which implies A -> C
New FDs are A->B, A->C, D->AC, D->E
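
The extraneous-attribute check can be sketched in Python using the standard closure test: an attribute is extraneous in an LHS when the closure of the remaining LHS attributes, computed under the full FD set, already contains the RHS. The function name is_extraneous and the (lhs, rhs) string-pair encoding are assumptions for this illustration.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_extraneous(attr, lhs, rhs, fds):
    """True if attr can be dropped from lhs in the FD lhs -> rhs.

    The FD lhs -> rhs itself stays in fds while the closure is computed.
    """
    reduced = set(lhs) - {attr}
    return set(rhs) <= closure(reduced, fds)


# FDs of the example after Step 1 (singleton RHS).
fds = [("A", "B"), ("AB", "C"), ("D", "A"), ("D", "C"), ("D", "E")]
print(is_extraneous("B", "AB", "C", fds))  # True:  A+ = ABC contains C
print(is_extraneous("A", "AB", "C", fds))  # False: B+ = B lacks C
```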
Step 3: Removing Redundant Functional Dependencies
Finding Redundant Dependencies and Minimal Cover – Ex 1
⚫ Now we have to identify the redundant dependencies among the new FDs below
New FDs
A->B
A->C
D->AC
D->E
Applying Singleton to RHS
A->B
A->C
D->A
D->C
D->E

1. Remove A->B and find the attribute closure for A
A+ = AC [Since A->C] – Without A->B, B cannot be found in A+. So A->B is not a redundant
dependency.
2. Remove A->C and find the attribute closure for A
A+ = AB [Since A->B] – Without A->C, C cannot be found in A+.
So A->C is not a redundant dependency.
3. Remove D->A and find the attribute closure for D
D+ = DCE [Since D->C, D->E] – Without D->A, A cannot be found in D+. So D->A is not a
redundant dependency.
4. Remove D->C and find the attribute closure for D
D+ = DAEBC [Since D->A, D->E, A->B, A->C] – Even without D->C, C can be found in D+. So
D->C is a redundant dependency and should be removed.
5. Remove D->E and find the attribute closure for D
D+ = DACB [Since D->A, A->C, A->B] – Without D->E, E cannot be found in D+. So D->E is
not a redundant dependency.
So, the Minimal Cover is what remains after removing
a) Extraneous Attributes
b) Redundant Dependencies
Minimal Functional Dependencies are
A -> B
A -> C
D -> A
D -> E
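
The redundancy test in Step 3 can be sketched in Python: drop one FD at a time and check whether its RHS is still reachable from its LHS using the remaining FDs. The function name remove_redundant and the (lhs, rhs) string-pair encoding are assumptions for this illustration. The sketch also assumes the input list has no duplicate FDs.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def remove_redundant(fds):
    """Drop each FD that is still implied by the FDs kept so far."""
    kept = list(fds)
    for fd in list(fds):
        others = [f for f in kept if f != fd]
        lhs, rhs = fd
        # Redundant if the RHS is derivable without using this FD.
        if set(rhs) <= closure(lhs, others):
            kept = others
    return kept


# Ex 1: FDs after Steps 1 and 2.
ex1 = [("A", "B"), ("A", "C"), ("D", "A"), ("D", "C"), ("D", "E")]
print(remove_redundant(ex1))
# [('A', 'B'), ('A', 'C'), ('D', 'A'), ('D', 'E')]
```

The FDs are tested in list order, and a different order can give a different but equivalent result, because a minimal cover is not unique in general.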
Example 2
⚫ Consider the Functional Dependencies
A -> B
B -> C
A -> C
Remove A -> B and find the attribute closure for A.
A+ = AC [B is not derived – Not redundant]
Remove B -> C and find the attribute closure for B.
B+ = B [C is not derived – Not redundant]
Remove A -> C and find the attribute closure for A.
A+ = ABC [A->B, B->C. C is derived – So, Redundant]
Final FDs without redundancy are
A->B and B->C
Example 3:
⚫ R(A,B,C,D,E)
F = {A->D, BC->AD, C->B, E->A, E->D}
Step 1: Singleton RHS
Step 2: Remove Extraneous Attributes
Step 3: Remove Redundant Dependencies

Step 1: Singleton RHS
A->D
BC->A
BC->D
C->B
E->A
E->D

Step2: Remove Extraneous Attribute


A->D
BC->A
BC->D
C->B
E->A
E->D
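Consider BC->A and BC->D
B+ = B, which contains neither A nor D, so C is not extraneous.
C+ = BCAD, which contains both A and D, so B is extraneous in BC->A and BC->D. Remove it.
After removing the extraneous attribute from the LHS, FD =>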
A->D
C->A
C->D
C->B
E->A
E->D
Step3: Remove Redundant Dependency
A->D
C->A
C->D

C->B
E->A
E->D
Remove A->D, A+ = A – Not Redundant [D not derived]
Remove C->A, C+ = CDB – Not Redundant [A not derived]
Remove C->D, C+ = CABD – Redundant [D derived]
New
F = {A->D,
C->A,
C->B,
E->A,
E->D}
Remove C->B, C+ = CAD – Not Redundant [B not derived]
Remove E->A, E+ = ED – Not Redundant [A not derived]
Remove E->D, E+ = EAD – Redundant [D derived]
Final – Minimal Cover
F = {A->D,
C->A,
C->B,
E->A}
EC+ = ECADB
EC will be the candidate Key
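
The three steps can be combined into one Python sketch of a minimal-cover routine. The function name minimal_cover, the single-letter attributes, and the (lhs, rhs) string-pair encoding are assumptions made for this illustration. The result depends on the order in which FDs and attributes are tested.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    """Return one minimal cover as a list of (lhs_frozenset, rhs_attr) pairs."""
    # Step 1: split every RHS into singleton attributes.
    work = [(frozenset(lhs), attr) for lhs, rhs in fds for attr in rhs]

    # Step 2: drop extraneous attributes from multi-attribute LHSs.
    for i, (lhs, rhs) in enumerate(work):
        for attr in sorted(lhs):
            reduced = lhs - {attr}
            if reduced and rhs in closure(reduced, work):
                lhs = reduced
                work[i] = (lhs, rhs)
    work = list(dict.fromkeys(work))  # remove duplicates, keep order

    # Step 3: drop FDs that the remaining FDs still imply.
    for fd in list(work):
        others = [f for f in work if f != fd]
        if fd[1] in closure(fd[0], others):
            work = others
    return work


def show(cover):
    """Format a cover as strings such as 'C->A'."""
    return ["".join(sorted(lhs)) + "->" + rhs for lhs, rhs in cover]


# Example 3 from the text.
F = [("A", "D"), ("BC", "AD"), ("C", "B"), ("E", "A"), ("E", "D")]
cover = minimal_cover(F)
print(show(cover))                        # ['A->D', 'C->A', 'C->B', 'E->A']
print(sorted(closure("EC", cover)))       # ['A', 'B', 'C', 'D', 'E']
```

The printed cover matches the final F above, and the closure of EC under that cover gives every attribute of R, which is consistent with EC being the candidate key.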
