
DBMS (R20) Unit - 4

This document provides an overview of schema refinement and normalization. It discusses the purpose of normalization to eliminate data redundancy and anomalies like insertion, update, and deletion anomalies. The document defines functional dependency and different forms of normalization like 1NF, 2NF, 3NF and BCNF. It explains concepts like closure of attributes and different types of functional dependencies such as fully functional dependency and partial functional dependency. The objectives are to discuss database anomalies, functional dependency, various normalization forms, and differentiate between normalization types.

Uploaded by

RONGALI CHANDINI


Aditya College of Engineering & Technology

Aditya Nagar, ADB Road, Surampalem - 533437

DATABASE MANAGEMENT SYSTEMS

UNIT IV: Schema Refinement (Normalization)

Syllabus:
Schema Refinement (Normalization): Purpose of normalization or schema refinement,
concept of functional dependency, normal forms based on functional dependency (1NF, 2NF
and 3NF), concept of surrogate key, Boyce-Codd normal form (BCNF), lossless join and
dependency-preserving decomposition, fourth normal form (4NF), fifth normal form (5NF).

Objectives:
After studying this unit, you will be able to:
• Discuss the different types of anomalies in a database
• State what a functional dependency is
• List the different forms of normalization
• Differentiate among the different types of normalization
DATABASE MANAGEMENT SYSTEMS UNIT – IV : NORMALIZATION

INTRODUCTION TO SCHEMA REFINEMENT

Schema refinement means improving a schema by applying a systematic technique. The most
widely used technique is decomposition.
Normalization means “splitting a table into smaller tables with fewer attributes, in such a
way that the design is free of insertion, deletion and update anomalies and keeps
redundancy to a minimum”.
Normalization, or schema refinement, is a technique for organizing the data in a database.
It is a systematic approach of decomposing tables to eliminate data redundancy and
undesirable characteristics such as insertion, update and deletion anomalies.
Redundancy: repetition of the same data, i.e., duplicate copies of the same data stored in
different locations.
Anomalies: problems that arise in poorly planned, unnormalized databases where all the data
is stored in one table (sometimes called a flat-file database).

Anomalies or problems faced without normalization (problems due to redundancy):

Consider a flat-file schema of this kind, in which student details (SID, Sname) and course
details (CID, FEE) are all stored in one table.

Because all the data is stored in a single table, data is redundant: the same SID and Sname
values are repeated for each CID. Let us discuss the anomalies one by one.
Redundancy can lead to the following problems:
1. Insertion anomalies: It may not be possible to store some information unless some other
information is stored as well.
2. Redundant storage: Some information is stored repeatedly.
3. Update anomalies: If one copy of redundant data is updated, an inconsistency is created
unless all redundant copies of the data are updated as well.
4. Deletion anomalies: It may not be possible to delete some information without losing some
other information as well.
Update anomaly: If the fee changes from 5000 to 7000, the FEE column must be updated in
every row that stores it; otherwise the data becomes inconsistent.

Insertion and deletion anomalies: These anomalies exist only because of redundancy;
without it they do not arise.
Insertion anomaly: A new course C4 is introduced, but no student has taken C4 yet.
To insert the course, we are forced to insert dummy student data along with it.
Deletion anomaly: Deleting student S3 also deletes the course that only S3 was taking.
Deleting some data thus forces the deletion of other useful data.


Solution to anomalies: decompose the table into smaller tables (schema refinement).
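The effect of decomposition on the three anomalies can be sketched in a few lines of Python. The rows, the column names (sid, sname, cid, fee) and the fee values are illustrative assumptions, since the original table is not reproduced in these notes.

```python
# Hypothetical flat table: student and course data stored together.
flat = [
    {"sid": "S1", "sname": "A", "cid": "C1", "fee": 5000},
    {"sid": "S2", "sname": "B", "cid": "C1", "fee": 5000},
    {"sid": "S3", "sname": "C", "cid": "C2", "fee": 9000},
]

# Decompose into three smaller tables (schema refinement).
students = {r["sid"]: r["sname"] for r in flat}    # sid -> sname
courses = {r["cid"]: r["fee"] for r in flat}       # cid -> fee
enrolled = {(r["sid"], r["cid"]) for r in flat}    # which student takes which course

# Update anomaly avoided: the fee of C1 is stored once, so one write is enough.
courses["C1"] = 7000

# Insertion anomaly avoided: a course without students needs no dummy student row.
courses["C4"] = 6000

# Deletion anomaly avoided: removing student S3 keeps course C2.
del students["S3"]
enrolled = {(s, c) for (s, c) in enrolled if s != "S3"}

print(courses)
```

After the three operations the course table still lists C1, C2 and C4, even though S3, the only student of C2, has been removed.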

Purpose of Normalization:

• Minimize the redundancy in data.
• Remove insert, update and delete anomalies during database activities.
• Reduce the need to reorganize the data when it is modified or enhanced.
• Normalization reduces a complex user view to a set of small subgroups of fields or
relations. This process helps to design a logical data model, known as the conceptual data
model.

Advantages of Normalization:
1. Greater overall database organization is gained.
2. The amount of unnecessary redundant data is reduced.
3. Data integrity is easily maintained within the database.
4. The database and application design processes are much more flexible.
5. Security is easier to maintain or manage.

Disadvantages of Normalization:
1. Normalization produces many tables, each with a relatively small number of columns.
These tables then have to be joined using their primary/foreign key relationships.
2. This has two consequences:
Performance: the joins required to merge data slow down processing and place
additional stress on the hardware.
Complex queries: developers have to write complex queries in order to merge
data from different tables.


Concept of Functional Dependency:

Functional dependencies are fundamental to the process of normalization: they play a key
role in differentiating good database designs from bad ones.
A functional dependency is a “type of constraint that is a generalization of the notion of
a key”.
A functional dependency describes a relationship between attributes (columns) in a table
and is represented by an arrow sign (→).
In other words, a dependency FD: “X → Y” means that the values of Y are determined by the
values of X: two tuples sharing the same values of X will necessarily have the same values of
Y. The attribute set on the left-hand side is known as the “determinant”; here X is the determinant.

Example [Identifying the FD’s]

A B C D
A1 B1 C1 D1
A1 B2 C1 D2
A2 B2 C2 D2
A2 B2 C2 D3
A3 B3 C2 D4

Case 1: A → B
Here A1 is associated with both B1 and B2, so A does not determine a unique value of B. The FD does not hold.
Case 2: A → C
Here A1 → C1 and A2, A3 → C2, so every value of A determines a unique value of C. The FD holds.
Note: try the remaining possibilities as well, i.e., A → D, B → C, B → D and C → D.
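The case-by-case check above can be automated. The following is a minimal sketch; the function name fd_holds and the dictionary layout are illustrative choices, not part of the notes. Keep in mind that a table instance can only refute an FD: an FD that holds in these five rows is consistent with the data, but it is the schema designer who declares that it holds in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is satisfied by every pair of rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # The first Y value seen for this X value is remembered; any later
        # row with the same X but a different Y violates the dependency.
        if seen.setdefault(x, y) != y:
            return False
    return True

# The example table from the text.
table = [
    {"A": "A1", "B": "B1", "C": "C1", "D": "D1"},
    {"A": "A1", "B": "B2", "C": "C1", "D": "D2"},
    {"A": "A2", "B": "B2", "C": "C2", "D": "D2"},
    {"A": "A2", "B": "B2", "C": "C2", "D": "D3"},
    {"A": "A3", "B": "B3", "C": "C2", "D": "D4"},
]

candidates = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D")]
results = {f"{l} -> {r}": fd_holds(table, l, r) for l, r in candidates}
for fd, ok in results.items():
    print(fd, ok)
```

Of the six candidates, only A -> C is satisfied by this instance.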

Reasoning about functional dependencies:

Armstrong's axioms (inference rules): The term Armstrong's axioms refers to the sound
and complete set of inference rules, introduced by William W. Armstrong, that is
used to test the logical implication of functional dependencies.

Armstrong's axioms define the rules for reasoning about functional dependencies and for
inferring all the functional dependencies that hold on a relational database.


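Various axioms or inference rules:

Primary axioms:
1. Reflexivity: if Y ⊆ X, then X → Y.
2. Augmentation: if X → Y, then XZ → YZ for any Z.
3. Transitivity: if X → Y and Y → Z, then X → Z.

Secondary or derived axioms:
1. Union: if X → Y and X → Z, then X → YZ.
2. Decomposition: if X → YZ, then X → Y and X → Z.
3. Pseudo-transitivity: if X → Y and WY → Z, then WX → Z.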

Closure of a Set of Attributes:

The attribute closure of an attribute set X, written X+, is the set of attributes that can be
functionally determined from X.
The set of FDs that is logically implied by F is called the closure of F and is written F+.
That is, if F is a set of FDs on a relation R, then F+ is the set of all FDs that can be
derived from F by using the inference axioms.
Example: R (A, B, C, D) with the set of functional dependencies A→B, B→D, C→B. What
is the closure of A, B, C and D?
Solution:
A+ = {A, B, D}, since A→B and B→D hold; C is not functionally dependent on A, so it is excluded.
B+ = {B, D}, since B→D holds; A and C are not functionally dependent on B, so they are excluded.
C+ = {C, B, D}, since C→B and B→D hold; A is not functionally dependent on C, so it is excluded.
D+ = {D}, since D does not appear on the left-hand side of any FD.

The algorithm for computing the attribute closure of a set X of attributes is shown below
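A minimal Python sketch of the standard fixed-point closure algorithm follows. The representation is an assumption made for brevity: each attribute is a single letter and each FD is a pair of strings (lhs, rhs).

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:                      # repeat until no FD adds anything new
        changed = False
        for lhs, rhs in fds:
            # If the whole left-hand side is already in the closure,
            # the right-hand side can be added as well.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# The example from the text: R(A, B, C, D) with A->B, B->D, C->B.
fds = [("A", "B"), ("B", "D"), ("C", "B")]
for x in "ABCD":
    print(f"{x}+ =", sorted(closure(x, fds)))
```

The loop terminates because the closure can only grow and is bounded by the attributes of R.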


Types of functional dependencies:

1. Fully Functional Dependency: Take a functional dependency X → Y (i.e., X determines Y).
Y is said to be fully functionally dependent on X if Y is not determined by any proper
subset of X, i.e., if removing any attribute from X breaks the dependency. In
normalization, X is typically a candidate key, and Y may be a prime or non-prime attribute.
Example: Consider the dependency ABC → D, i.e., ABC determines D, and suppose D is not
determined by any proper subset of ABC (A, B, C, AB, AC or BC); that is, FDs such as
BC→D, C→D and A→D do not hold. Then D is fully functionally dependent on ABC.

2. Partial Functional Dependency: If a non-prime attribute of the relation is determined
by only a part of a candidate key, the dependency is known as a partial dependency.
(OR)
In a relation whose key has more than one field, some non-key fields may depend on
all the key fields while another non-key field depends on only one
of the key fields. Such a dependency is a partial dependency.
Example: Consider the dependencies AC→P, A→D, D→P. Here P is not fully functionally
dependent on AC: computing A+ (the closure of A) gives A→D and D→P, i.e., A→P, so C is
not needed to determine P. Hence P is partially dependent on AC.
A table cannot have a partial FD under any of the following conditions:
(1) The primary key consists of a single attribute.
(2) The table consists of only two attributes.
(3) All the attributes in the table are part of the primary key.

3. Transitive Functional Dependency: If a non-prime attribute of a relation is determined
by another non-prime attribute, or by a combination of part of the
candidate key and a non-prime attribute, the dependency is a
transitive dependency; i.e., it is a dependency among the non-key
fields of a relation.
Example: If X→Y and Y→Z hold, then X→Z holds transitively.
A table cannot have a transitive FD under either of the following conditions:
(1) The table consists of only two attributes.
(2) All the attributes in the table are part of the primary key.
4. Trivial Functional Dependency: This follows from the reflexive rule, i.e., if X is a set
of attributes and Y is a subset of X, then X→Y holds.
Example: ABC→BC is a trivial dependency.


5. Multi-Valued Dependency: Consider three fields X, Y and Z in a relation. If for each value
of X there is a well-defined set of values of Y and a well-defined set of values of Z, and the set
of values of Y is independent of the set of values of Z, the dependency is a multi-valued
dependency, written X →→ Y | Z.
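The trivial, full and partial cases among the types above can be tested mechanically with attribute closure. This is a sketch under the assumption that attributes are single letters and FDs are (lhs, rhs) string pairs; the helper names is_trivial and is_partial are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_trivial(lhs, rhs):
    """X -> Y is trivial when Y is a subset of X (reflexive rule)."""
    return set(rhs) <= set(lhs)

def is_partial(lhs, rhs, fds):
    """True if rhs is already determined by some proper subset of lhs."""
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if set(rhs) <= closure(subset, fds):
                return True
    return False

# Examples taken from the text.
print(is_trivial("ABC", "BC"))                                      # True
print(is_partial("AC", "P", [("AC", "P"), ("A", "D"), ("D", "P")]))  # True: A+ = {A, D, P}
print(is_partial("ABC", "D", [("ABC", "D")]))                       # False: D is fully dependent
```

A dependency for which is_partial returns False is a full functional dependency in the sense of item 1.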

Prime and non-prime attributes

Attributes that are part of any candidate key of a relation are called prime attributes; the others
are non-prime attributes.

Candidate Key:
A candidate key is a minimal set of attributes of a relation that can be used to identify a tuple
uniquely.
Consider the student table: student(sno, sname, sphone, age).
We can take sno as a candidate key. A table can have more than one candidate key.
Types of candidate keys:
1. Simple (having only one attribute)
2. Composite (having multiple attributes as the candidate key)

Super Key:
A super key is a set of attributes of a relation that can be used to identify a tuple uniquely.
• Adding zero or more attributes to a candidate key generates a super key.
• Every candidate key is a super key, but the converse is not true.
Consider the student table: student(sno, sname, sphone, age).
We can take sno or (sno, sname) as a super key.

Operations performed using functional dependencies (applications of the closure of a set of
attributes):
(1) To identify additional FDs.
(2) To identify keys.
(3) To identify the equivalence of sets of FDs.
(4) To identify the irreducible set (minimal set) of FDs, also called the canonical or standard
form of the FDs.

(1) To identify additional FDs:

To check whether an FD such as A → B can be derived from a set F1, compute A+ under F1. If A+
includes B, then A → B can be derived from F1.

Examples:

1. In a schema with attributes A, B, C, D and E, the following set of FDs is given:
A → B, A → C, CD → E, B → D, E → A. Determine whether CD → AC follows from the given FDs.

Sol: The FD to test is CD → AC, so find the closure of CD.

CD+ = CDE (∵ CD → E)
= CDEA (∵ E → A)
= CDEAB (∵ A → B)
The closure contains both A and C, so CD → AC holds.

2. Check whether D → A can be derived from the following FDs: AB → C, BC → AD, D → E,
CF → B.
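The membership test used in these examples (compute the closure of the left-hand side and look for the right-hand side) can be sketched as follows. The FD sets are read as A → B, A → C, CD → E, B → D, E → A for the first example and AB → C, BC → AD, D → E, CF → B for the second; the function name implies is a hypothetical choice.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, lhs, rhs):
    """lhs -> rhs follows from fds exactly when rhs is inside lhs+."""
    return set(rhs) <= closure(lhs, fds)

f1 = [("A", "B"), ("A", "C"), ("CD", "E"), ("B", "D"), ("E", "A")]
f2 = [("AB", "C"), ("BC", "AD"), ("D", "E"), ("CF", "B")]

print(implies(f1, "CD", "AC"))   # True: CD+ = {A, B, C, D, E}
print(implies(f2, "D", "A"))     # False: D+ = {D, E}
```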

(2) Identification of keys by using the closure of a set of attributes:

A key attribute: An attribute that is capable of identifying all other attributes in a given table.

(i) Primary key: An attribute with unique values in a table, used to enforce entity integrity and
to identify rows in the table uniquely.
(ii) Composite primary key: Sometimes a single attribute is not sufficient to identify
the rows in the table uniquely, so we combine two or more attributes to identify the
rows uniquely.
(iii) Candidate keys: Sometimes two or more independent attributes (or attribute sets) can each be
used to identify the rows uniquely. E.g., in (vehicle no, vehicle engine no, purchase date), either
vehicle no or vehicle engine no can be used as a key attribute; they are called
candidate keys. One of the candidate keys is chosen as the primary key.

Example 1: Find the candidate keys for the relation R(ABCD) having the following FDs: AB → CD,
C → A, D → A.

Sol: From the given FDs, the attribute B must be part of every key because it does not appear on the RHS of
any functional dependency.

B+ = B (not a candidate key, so try the combinations containing B)

AB+ = ABCD (∵ AB → CD)
BC+ = BCAD (∵ C → A, AB → CD)
BD+ = BDAC (∵ D → A, AB → CD)
CD+ = CDA (∵ D → A)
AC+ = AC
AD+ = AD
From the above, AB, BC and BD determine all attributes, and no proper subset of any of them does.
AB, BC and BD are candidate keys.
Example 2: Find the candidate keys for the relation R(ABCDE) having the following FDs: A → BC,
CD → E, B → D, E → A.

Sol: From the given FDs, no attribute is forced into the key, because every attribute appears on the RHS of
some functional dependency. So check the closures of the LHS attributes.

A+ = ABC (∵ A → BC)
= ABCD (∵ B → D)
= ABCDE (∵ CD → E)
B+ = BD (∵ B → D)
E+ = EA (∵ E → A)
= EABC (∵ A → BC)
= EABCD (∵ B → D)
C+ = C
D+ = D
CD+ = CDE (∵ CD → E)
= CDEA (∵ E → A)
= CDEAB (∵ A → BC)
BC+ = BCD (∵ B → D)
= BCDE (∵ CD → E)
= BCDEA (∵ E → A)
From the above, A, E, CD and BC determine all attributes.
A, E, CD and BC are candidate keys.
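Both examples can be cross-checked by brute force: try every attribute combination from the smallest upward and keep those whose closure is the whole relation and that contain no smaller key. The sketch below assumes single-letter attributes and the FD sets AB → CD, C → A, D → A (Example 1) and A → BC, CD → E, B → D, E → A (Example 2). It is exponential in the number of attributes, so it suits only small classroom schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attrs, fds):
    """Return all minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue                 # contains a smaller key, so not minimal
            if closure(candidate, fds) == set(attrs):
                keys.append(candidate)
    return ["".join(sorted(key)) for key in keys]

print(candidate_keys("ABCD", [("AB", "CD"), ("C", "A"), ("D", "A")]))
# ['AB', 'BC', 'BD']
print(candidate_keys("ABCDE", [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]))
# ['A', 'E', 'BC', 'CD']
```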


(3) To identify the equivalence of sets of FDs:

Different database designers may define different sets of FDs from the same requirements. Two
sets F and G are equivalent if we are able to derive all FDs in G from F and vice
versa.

Example: Find whether the following two sets of FDs are equivalent.
F = {A → C, AC → D, E → AD, E → H}
G = {A → CD, E → AH}


Sol:

Step 1: Take set F and check that every FD in G can be derived from F.

A → CD
A+ from F
= A
= AC (∵ A → C)
= ACD (∵ AC → D)
∴ A → CD can be derived from F.

E → AH
E+ from F
= E
= EAD (∵ E → AD)
= EADH (∵ E → H)
∴ E → AH can be derived from F.

Step 2: Take set G and check that every FD in F can be derived from G.

A → C
A+ from G
= A
= ACD (∵ A → CD)
∴ A → C can be derived from G. (AC → D also follows, since AC+ from G contains D.)

E → AD
E+ from G
= E
= EAH (∵ E → AH)
= EAHCD (∵ A → CD)
∴ E → AD and E → H can be derived from G.
∴ G and F are equivalent.
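The two-step check can be written compactly with attribute closure. This sketch assumes the sets read off from the solution steps, F = {A → C, AC → D, E → AD, E → H} and G = {A → CD, E → AH}; the helper names covers and equivalent are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def covers(f, g):
    """True if every FD in g can be derived from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    """Two FD sets are equivalent when each covers the other."""
    return covers(f, g) and covers(g, f)

F = [("A", "C"), ("AC", "D"), ("E", "AD"), ("E", "H")]
G = [("A", "CD"), ("E", "AH")]

print(covers(F, G), covers(G, F), equivalent(F, G))   # True True True
```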


(4) To identify the irreducible form of FDs / canonical form (minimal cover):
We try to minimize the set of functional dependencies. The minimized set should be equivalent to the
original set.
Procedure to find the minimal set:
Step 1: Rewrite the FDs so that every FD has a single attribute on the RHS.
Step 2: Evaluate each FD from step 1 for its necessity. If it can be derived from the others, remove it
from the list.
Step 3: Evaluate the necessity of each LHS attribute in the FDs obtained from step 2. If an attribute is not
necessary, remove it from the FD.
Step 4: Apply the union rule to the FDs from step 3 that share a common LHS.
The result is the irreducible set.
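The four steps can be sketched in Python as below. The input FD set is a hypothetical example, not one from these notes, and attributes are assumed to be single letters. One deliberate addition, marked in the code: the redundancy check of Step 2 is repeated after Step 3, because shortening a left-hand side can make another FD redundant.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def drop_redundant(fds):
    """Step 2: remove every FD that the remaining FDs already imply."""
    fds = list(dict.fromkeys(fds))            # drop exact duplicates first
    for fd in list(fds):
        rest = [other for other in fds if other != fd]
        if fd[1] in closure(fd[0], rest):
            fds = rest
    return fds

def minimal_cover(fds):
    # Step 1: exactly one attribute on every right-hand side.
    g = [(frozenset(lhs), a) for lhs, rhs in fds for a in rhs]
    # Step 2: remove unnecessary FDs.
    g = drop_redundant(g)
    # Step 3: remove unnecessary left-hand-side attributes.
    for i, (lhs, a) in enumerate(g):
        for b in sorted(lhs):
            if len(lhs) > 1 and a in closure(lhs - {b}, g):
                lhs = lhs - {b}
                g[i] = (lhs, a)
    # Extra pass (an assumption beyond the four steps in the text).
    g = drop_redundant(g)
    # Step 4: union rule, merge FDs that share a left-hand side.
    merged = {}
    for lhs, a in g:
        merged.setdefault(lhs, []).append(a)
    return [("".join(sorted(l)), "".join(sorted(r))) for l, r in merged.items()]

# Hypothetical input: A->BC, B->C, A->B, AB->C.
print(minimal_cover([("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C")]))
# [('A', 'B'), ('B', 'C')]
```

A minimal cover is not unique in general; the order in which FDs and attributes are examined can change which equivalent set is returned.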


Normal forms based on functional dependency (1NF, 2NF, 3NF, Boyce-Codd normal form (BCNF), 4NF)

The evolution of normalization theory, i.e., the sequence of normal forms, is as follows:
1. First Normal Form (1NF)
2. Second Normal Form (2NF)
3. Third Normal Form (3NF)
4. Boyce-Codd Normal Form (BCNF)
5. Fourth Normal Form (4NF)
6. Fifth Normal Form (5NF).

Points to Remember
• 1NF is mandatory; the remaining normal forms are optional.
• If the tables are derived from E-R diagrams, 4NF and 5NF usually need not be applied
to them.
• In practice, normalization is applied up to 3NF; we very rarely go beyond that.
• 2NF deals with partial dependencies and 3NF deals with transitive
dependencies.


First Normal Form (1NF): An un-normalized relation is said to be in 1NF once
it satisfies the following conditions:
1. Each attribute name must be unique.
2. Each attribute value must be single or atomic, i.e., attributes are single-valued.
3. Each row / record must be unique.
4. There are no repeating groups.
Example: How do we bring an un-normalized table into first normal form? Consider a
relation with a [Color] column.

Solution: The table is not in first normal form because the [Color] column can contain
multiple values. For example, the first row includes the values "red" and "green". To bring the
table to first normal form, we split it into two tables, so that each [Color] value is stored
in a row of its own.
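The split into two tables can be sketched in Python as follows. The rows and the column names product_id and price are hypothetical, since the original table is not reproduced in these notes; only the multi-valued Color column with "red" and "green" in the first row comes from the text.

```python
# Hypothetical un-normalized rows: the color column holds several values.
products = [
    {"product_id": 1, "color": "red, green", "price": 15.99},
    {"product_id": 2, "color": "yellow", "price": 23.99},
]

# Table 1: everything except the multi-valued column.
product_price = [
    {"product_id": p["product_id"], "price": p["price"]} for p in products
]

# Table 2: one row per (product, color) pair, so every value is atomic.
product_color = [
    {"product_id": p["product_id"], "color": c.strip()}
    for p in products
    for c in p["color"].split(",")
]

for row in product_color:
    print(row)
```

Each row of product_color now holds exactly one color, and product_id links it back to product_price.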

Second Normal Form (2NF): A relation is said to be in 2NF if it is already in 1NF and it
has no partial dependency, i.e., no non-prime attribute is dependent on only a part of a
candidate key.
(OR)
A relation is in second normal form if it satisfies the following conditions:
• It is in first normal form.
• All non-key attributes are fully functionally dependent on the primary key.


Note: Partial functional dependency: if a non-prime attribute of the relation is determined
by only a part of a candidate key, the dependency is known as a partial
dependency.

Example: Consider a relation with the attributes [Customer ID], [Store ID] and [Purchase Location].

➔ This table has a composite primary key [Customer ID, Store ID]. The non-key attribute is
[Purchase Location]. In this case, [Purchase Location] depends only on [Store ID], which is
only part of the primary key. Therefore, this table does not satisfy second normal form.
➔ To bring this table to second normal form, we break it into two tables: one holding
[Customer ID, Store ID] and one holding [Store ID, Purchase Location].
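A small Python sketch of this decomposition is given below. The rows are hypothetical; only the three attribute names come from the text. The final comparison shows that joining the two smaller tables on the store ID gives back the original rows, so nothing is lost.

```python
# Hypothetical rows: (customer_id, store_id, purchase_location).
purchases = [
    (1, 1, "Los Angeles"),
    (1, 3, "San Francisco"),
    (2, 1, "Los Angeles"),
    (3, 2, "New York"),
]

# Table 1: the composite key only.
customer_store = [(c, s) for c, s, _ in purchases]

# Table 2: store_id -> purchase_location, stored once per store.
store_location = {s: loc for _, s, loc in purchases}

# Joining the two tables on store_id reproduces the original rows.
rejoined = [(c, s, store_location[s]) for c, s in customer_store]

print(store_location)
print(rejoined == purchases)   # True
```

"Los Angeles" is now stored once instead of once per purchase at store 1.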

Q1 Given the relation R(ABCD) and F: {AB → C, B → D}, decompose it into 2NF.

From the given FDs, determine the primary key. The attributes A and B must be included in the key
(because these attributes do not appear on the RHS of any FD).
Find the closure of AB:
AB+ = ABC (∵ AB → C)
= ABCD (∵ B → D)
∴ AB is the primary key.
From the FDs, B → D is a partial dependency on AB, so decompose the table.
(D is a non-prime attribute determined by a part of the key.)
R(ABCD) is decomposed into:
R1(ABC) with AB → C
R2(BD) with B → D


Q2 Consider the relation R = ABCDEF with the set of FDs A → FC, C → D, B → E. Find the
key and normalize into 2NF.

Third Normal Form (3NF): A relation is in third normal form if it satisfies the following
conditions:
• It is in 2NF.
• There is no transitive functional dependency.
• By transitive functional dependency, we mean that the following relationships hold in
the table: B is functionally dependent on A (A → B), and C is functionally dependent on B (B → C). In
this case, C is transitively dependent on A via B. In other words, a non-key attribute
depends on another non-key attribute.

Example: Consider a relation that includes the attributes [Book ID], [Genre ID] and [Genre Type].

➔ In this table, [Book ID] determines [Genre ID], and [Genre ID] determines [Genre Type].
Therefore, [Book ID] determines [Genre Type] via [Genre ID]: we have a transitive
functional dependency, and this structure does not satisfy third normal form.
➔ To bring this table to third normal form, we split it into two tables: [Genre Type] moves to
its own table keyed by [Genre ID], and the original table keeps [Genre ID] as a reference.


Q1 Given the relation R(ABCDE) and F: {AB → C, B → D, D → E}, decompose it into 3NF.
From the given FDs, determine the primary key. The attributes A and B must be included in the key
(because these attributes do not appear on the RHS of any FD).
Find the closure of AB:
AB+ = ABC (∵ AB → C)
= ABCD (∵ B → D)
= ABCDE (∵ D → E)
∴ AB is the primary key.
From the FDs, B → D is a partial dependency on AB, so decompose the table.
(D is a non-prime attribute determined by a part of the key.)
B+ = BDE

R(ABCDE) is decomposed on B+ into:
R1(ABC) with AB → C
R2(BDE) with B → D, D → E
∴ The tables are in 2NF, but R2 is not in 3NF because D → E is a transitive dependency.
(No non-key attribute should determine another non-key attribute.)
D+ = DE
R2(BDE) is decomposed on D+ into:
R21(BD) with B → D
R22(DE) with D → E
∴ The tables are now in 3NF.
The relations after decomposing into 3NF:
R1: ABC
R2: BD
R3: DE
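The reasoning in this solution (spot the partial dependency first, then the transitive one) can be sketched as a small classifier. It is a simplification that assumes the relation has a single candidate key, as R(ABCDE) with key AB does; the function name classify and the label strings are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def classify(attrs, key, fds):
    """Label each FD as ok, partial or transitive with respect to one key."""
    prime = set(key)                      # assumption: key is the only candidate key
    labels = {}
    for lhs, rhs in fds:
        name = f"{lhs}->{rhs}"
        targets = set(rhs) - set(lhs) - prime   # non-prime attributes being determined
        if not targets or closure(lhs, fds) == set(attrs):
            labels[name] = "ok"                          # trivial, or LHS is a super key
        elif set(lhs) < prime:
            labels[name] = "partial (breaks 2NF)"        # LHS is part of the key
        else:
            labels[name] = "transitive (breaks 3NF)"     # LHS involves a non-prime attribute
    return labels

fds = [("AB", "C"), ("B", "D"), ("D", "E")]
for fd, label in classify("ABCDE", "AB", fds).items():
    print(fd, ":", label)
```

For this relation the sketch reports AB->C as ok, B->D as a partial dependency and D->E as a transitive dependency, matching the two decomposition steps above.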


Q2 Given the relation R = ABCDEFGHIJ and the set of FDs AB → C, A → DE, B → F, F → GH,
D → IJ, decompose R into 3NF.

Q3 (a) Give a set of FDs for the relation schema R(ABCD) with primary key AB under
which R is in 1NF but not in 2NF.
(b) Find FDs such that R is in 2NF but not in 3NF.

Sol: R = ABCD
Key = AB
(a) 1NF requires only atomic values, while 2NF does not allow partial dependencies.
Any of the following FDs keeps R in 1NF but violates 2NF:
B → C, A → C, B → D, A → D
(each of these is a partial dependency on the key AB)
(b) Here partial dependencies are not allowed, but a transitive
dependency is. Either of the following FDs qualifies:
C → D, D → C


Boyce-Codd normal form (BCNF): A relation is said to be in BCNF if and only if every
determinant is a candidate key.
✓ BCNF is the advanced version of 3NF. It is stricter than 3NF.
✓ A table is in BCNF if for every non-trivial functional dependency X → Y, X is a super key of the table.
✓ For BCNF, the table should be in 3NF and, for every FD, the LHS should be a super key.

Example: Let's assume there is a company where employees work in more than one
department. EMPLOYEE table:

emp_id  emp_nationality  emp_dept                      dept_type  dept_no_of_emp
1001    Austrian         Production and planning       D001       200
1001    Austrian         stores                        D001       250
1002    American         design and technical support  D134       100
1002    American         Purchasing department         D134       600

➔ In the above table, the functional dependencies are as follows: emp_id → emp_nationality
and emp_dept → {dept_type, dept_no_of_emp}. Candidate key: {emp_id, emp_dept}.
➔ The table is not in BCNF because neither emp_dept nor emp_id alone is a key. To
convert the given table into BCNF, we decompose it into three tables: (emp_id, emp_nationality),
(emp_dept, dept_type, dept_no_of_emp) and (emp_id, emp_dept).


For BCNF problems, refer to your notebook.

Q1 Consider the relation schema R(A,B,C), which has the FD B → C. If A is a candidate key for R,
is it possible for R to be in BCNF? If so, under what conditions? If not, explain why not.
Sol: The only way R could be in BCNF is if B includes a key, i.e., B is also a key for R.
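The BCNF condition (every non-trivial FD has a super key on its left-hand side) is easy to test with attribute closure. In the sketch below, the statement "A is a candidate key" is modelled by the FD A → BC, which is an assumption about how to encode the question; the function name bcnf_violations is hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attrs, fds):
    """Return the non-trivial FDs whose left-hand side is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(attrs)
    ]

# A is a key, B -> C holds, but B is not a key: B -> C violates BCNF.
print(bcnf_violations("ABC", [("A", "BC"), ("B", "C")]))    # [('B', 'C')]

# If B is also a key (B -> AC), no FD violates BCNF.
print(bcnf_violations("ABC", [("A", "BC"), ("B", "AC")]))   # []
```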

Fourth Normal Form (4NF): A relation is said to be in 4NF if it is in Boyce-Codd normal
form and has no multi-valued dependency.
✓ For a dependency A →→ B, if multiple values of B exist for a single value of A, then the
dependency is a multi-valued dependency.
✓ Note: Multi-valued dependency: A table is said to have a multi-valued dependency if the
following conditions are true:
1. For a dependency A →→ B, multiple values of B exist for a single value of A.
2. The table has at least 3 columns.
3. For a relation R (A, B, C), if there is a multi-valued dependency between A and
B, then B and C are independent of each other.
◼ If all these conditions are true for a relation (table), it is said to have a multi-valued
dependency.

Example

• Consider a STUDENT table with the attributes STU_ID, COURSE and HOBBY. It is in 3NF, but COURSE
and HOBBY are two independent entities, so there is no relationship between them. In the STUDENT
relation, the student with STU_ID 21 has two courses, Computer and Math, and two
hobbies, Dancing and Singing. So there are multi-valued dependencies on STU_ID,
which lead to unnecessary repetition of data.
• To bring the table into 4NF, we decompose it into two tables:
STUDENT_COURSE (STU_ID, COURSE)
STUDENT_HOBBY (STU_ID, HOBBY)
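The decomposition can be checked with a short Python sketch. The rows are hypothetical apart from student 21 with the courses Computer and Math and the hobbies Dancing and Singing; because course and hobby are independent, every combination appears in the original table.

```python
# Hypothetical STUDENT rows: (stu_id, course, hobby).
student = {
    (21, "Computer", "Dancing"),
    (21, "Computer", "Singing"),
    (21, "Math", "Dancing"),
    (21, "Math", "Singing"),
    (34, "Chemistry", "Cricket"),
}

# Decompose into the two 4NF tables.
student_course = {(s, c) for s, c, _ in student}
student_hobby = {(s, h) for s, _, h in student}

# Joining the two tables on stu_id gives back exactly the original rows.
rejoined = {
    (s, c, h)
    for s, c in student_course
    for s2, h in student_hobby
    if s == s2
}

print(sorted(student_course))
print(sorted(student_hobby))
print(rejoined == student)   # True
```

Student 21 needs four rows in the original table but only two rows in each decomposed table, and adding a third hobby would add one row instead of two.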

Concept of Surrogate Key:

✓ A surrogate key is an alternative to the natural primary key. Because it carries no business
meaning, it can identify each row even when the remaining data in two records is duplicated.
✓ A surrogate key is a unique identification key. It acts as an artificial replacement for the production key:
the production key may be alphanumeric or composite, whereas the surrogate key is
always a single numeric key.
✓ A surrogate key has the following characteristics:
i. The value is never reused and is unique within the whole system.
ii. It is system generated and an integer.
iii. The value cannot be manipulated by the user or application.
iv. The value is not an amalgam of different values from multiple domains.
✓ Surrogate keys can be generated in a variety of ways, and most databases offer ways to
generate them.
Example: Oracle uses SEQUENCE,
MySQL uses AUTO_INCREMENT,
and SQL Server uses IDENTITY.
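A system-generated surrogate key can be tried out with Python's built-in sqlite3 module. SQLite is used here only because it ships with Python; it is not one of the systems named above, and its AUTOINCREMENT keyword plays the role that SEQUENCE, AUTO_INCREMENT and IDENTITY play there. The table, column names and sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " student_key INTEGER PRIMARY KEY AUTOINCREMENT,"   # surrogate key
    " roll_no TEXT,"                                     # natural (production) key
    " name TEXT)"
)

# The application never supplies student_key; the database generates it.
cur = conn.execute(
    "INSERT INTO student (roll_no, name) VALUES (?, ?)", ("R20-001", "Asha")
)
first_key = cur.lastrowid
cur = conn.execute(
    "INSERT INTO student (roll_no, name) VALUES (?, ?)", ("R20-002", "Ravi")
)
second_key = cur.lastrowid

print(first_key, second_key)   # 1 2
rows = conn.execute("SELECT student_key, roll_no FROM student").fetchall()
print(rows)
conn.close()
```

Other tables can then reference student_key, a single integer, instead of the alphanumeric roll number.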

Lossless join and Dependency preserving decomposition:


Review Questions

1. What is Functional Dependency? Explain types and properties of FD’s.

2. What is a normal form? Explain about various normal forms with examples.
3. What is normalization? Differentiate between second normal form and third normal
form.
4. Explain briefly about 3NF, 4NF with suitable examples?
5. Explain about Boyce Codd normal form with an example.
6. Why is normalization needed? Explain the process of normalization.
7. How to compute closure of set of functional dependency? Explain with a suitable
example schema.
8. What is multi valued dependency? State and explain fourth normal form based on this
concept.
9. List and explain the inference rules of functional dependencies.
10. Explain insertion, deletion, and modification anomalies.
11. What is the importance of dependency preservation during decomposition? How to
achieve it?
12. Consider the relation R on attributes (ABCDE) with functional dependencies:
AB → CDE, AC → BDF, B → C, C → B, C → D, B → E
i) Determine a Key for relation R
ii) Find 3NF decomposition for R using normalization process

13. Give a set of FDs for the relation schema R(A,B,C,D) with primary key AB under which
R is in 1NF but not in 2NF.
14. Why is a relation that is in 3NF generally considered good? Explain.
15. Discuss about 4NF with suitable example.
16. What are the problems caused by redundantly storing information? Explain
17. Given Relation, R=(A,B,C,D,E,F,G) and Functional Dependencies
F={ {A,B}→{C}, { A,C}→{B}, {A,D}→{E}, {B}→{D}, { B,C}→{A}, {E}→{F}}.
Check whether the following decomposition of R into R1=(A,B,C), R2=(A,C,D,E) and
R3=(A,D,F) is satisfying the lossless Decomposition property.
18. What is dependency preservation property for decomposition? Explain why it is important.
19. Given a Relation R=(X,Y,Z) and Functional Dependencies are F={ {X,Y}→{Z}, {Z}→{X} }
Determine all Candidate keys of R and the normal form of R with proper explanation.
20. Define functional dependency? How can you compute the minimal cover for a set of
functional dependencies? Explain it with an example.


21. Consider schema R = (A, B, C, G, H, I) and the set F of functional dependencies {A → B, A → C,
CG → H, CG → I, B → H}. Compute the candidate keys of the schema. Compute the closure of the
same.
22. Explain 3NF & BCNF. What is the difference between them?
23. What is functional dependency? Explain its usage in database design.
24. What is a surrogate key? How can it be used for schema refinement?
25. How to compute closure of set of functional dependency? Explain with a suitable example schema.
26. What is multi valued dependency? State and explain fourth normal form based on this concept.
27. Given a set of FDs for the relation schema R(A,B,C,D) with primary key AB, and D → C or
C → D or AC → D or AD → C or BC → D or BD → C. In which normal form is R?
28. Discuss the problems caused by redundancy and the purpose of normalization.
29. Give relation schemas for the following normal forms
i) 2NF but not in 3NF ii) 3NF but not in BCNF


ADITYA COLLEGE OF ENGINEERING AND TECHNOLOGY 29

