Database Design - Normalization

The document discusses database normalization. It defines normalization as the process of decomposing large tables into smaller tables without losing data to improve data integrity and reliability. The document outlines the various normal forms including 1NF, 2NF, 3NF, BCNF, 4NF and 5NF. It provides examples of tables and how to decompose them to satisfy each normal form through removing anomalies like transitive dependencies and multivalued dependencies. The benefits of normalization include using less storage space, allowing quicker updates, and having a more flexible structure.

Uploaded by

BIL BROTHERS

18CS440 / Database Management Systems 2020

NORMALIZATION

 One of the most challenging aspects of traditional database design is
Normalization.
 It is the process of decomposing large, inefficiently structured tables into smaller,
more efficiently structured tables without losing any data in the process.
 It guarantees data integrity and ensures that the information retrieved from the
database will be accurate and reliable.
 E.F. Codd originally established three normal forms: 1NF, 2NF and 3NF.
 3NF is widely considered to be sufficient for many practical applications.
 There are seven Normal Forms, and each one was created to deal with specific
types of problems.
Normal Form                  Based on
First Normal Form            Functional Dependency
Second Normal Form           Functional Dependency
Third Normal Form            Functional Dependency
Fourth Normal Form           Multivalued Dependency
Fifth Normal Form            Join Dependency
Boyce/Codd Normal Form       Functional Dependency
Domain/Key Normal Form       The definition of Domains and Keys

Benefits of Normalization
• Less storage space
• Quicker updates
• Less data inconsistency
• Clearer data relationships
• Easier to add data
• Flexible structure

Dr. A.M. Rajeswari, CSE, TCE Page 1 of 11



 There are a couple of things you need to check before you start the Normalization
process.
1. Each table must have a Primary Key.
2. A table cannot contain repeating groups of data.

A non-normalized Orders table


 1NF
A relation schema R is in the first normal form if the domain of each of its
attributes contains only atomic values (no attribute may be composite or
multivalued, and no repeating groups of fields are allowed).

Orders table in First Normal Form
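
As a minimal sketch of what the 1NF rule means in practice, the Python snippet below flattens a record that holds a repeating group into one row per item, so that every field holds a single atomic value. The field names (OrderID, Customer, Items, Product, Quantity) and the sample values are assumptions made for illustration; the original Orders table is not reproduced in these notes.

```python
def to_first_normal_form(orders):
    """Return one flat row per (order, item) so every field is atomic."""
    rows = []
    for order in orders:
        for item in order["Items"]:
            rows.append({
                "OrderID": order["OrderID"],
                "Customer": order["Customer"],
                "Product": item["Product"],
                "Quantity": item["Quantity"],
            })
    return rows


# Hypothetical non-normalized data: "Items" is a repeating group.
unnormalized_orders = [
    {"OrderID": 1, "Customer": "Asha",
     "Items": [{"Product": "Pen", "Quantity": 2},
               {"Product": "Ink", "Quantity": 1}]},
    {"OrderID": 2, "Customer": "Ravi",
     "Items": [{"Product": "Pen", "Quantity": 5}]},
]

flat_rows = to_first_normal_form(unnormalized_orders)
for row in flat_rows:
    print(row)
```

Each output row repeats OrderID and Customer, which is the redundancy that the later normal forms then remove.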


2NF

 A relation schema R is in the second normal form with respect to a set of
functional dependencies F if:
1. It is in 1NF
2. Every Non-Primary-Key attribute is fully functionally dependent
upon the ENTIRE Primary-Key for its existence
 The Orders table has three distinct problems:
1. It contains a calculated field (Total)
2. It contains a transitive dependency between OrderID and Product
3. It actually describes two subjects: Orders and OrderDetails
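
The "fully functionally dependent" condition in the 2NF definition can be checked mechanically against sample data. The sketch below is only an illustration: the composite key (OrderID, ProductID), the non-key attributes OrderDate and Quantity, and the rows are all assumed, and a check on sample rows can only show that a dependency is not contradicted by the data, not that it holds for the schema in general.

```python
from itertools import combinations


def determines(rows, lhs, rhs):
    """True if the functional dependency lhs -> rhs holds in the given rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = row[rhs]
        # The same lhs value must always map to the same rhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True


def partial_dependencies(rows, primary_key, non_key_attrs):
    """List (subset, attribute) pairs where a proper subset of the
    primary key already determines a non-key attribute."""
    found = []
    for size in range(1, len(primary_key)):
        for subset in combinations(primary_key, size):
            for attr in non_key_attrs:
                if determines(rows, subset, attr):
                    found.append((subset, attr))
    return found


# Hypothetical 1NF rows with the composite key (OrderID, ProductID).
orders_1nf = [
    {"OrderID": 1, "ProductID": 10, "OrderDate": "2020-01-05", "Quantity": 2},
    {"OrderID": 1, "ProductID": 11, "OrderDate": "2020-01-05", "Quantity": 1},
    {"OrderID": 2, "ProductID": 10, "OrderDate": "2020-01-06", "Quantity": 5},
]

violations = partial_dependencies(
    orders_1nf, ("OrderID", "ProductID"), ["OrderDate", "Quantity"]
)
print(violations)
```

In this assumed data, OrderDate depends on OrderID alone, so it belongs in an Orders table, while Quantity depends on the whole key and stays with the order details.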


 Note that the transitive dependency and the calculated field have migrated to the
OrderDetails table.
 Remove the calculated field from the table.
 The table describes two different subjects because it has a transitive
dependency between OrderDetails and Products.


3NF

 A relation schema R is in the third normal form if and only if it is in 2NF and
every non-key attribute is non-transitively dependent on the Primary Key.
 As the definition states, a table must already be in Second Normal Form before
you can apply Third Normal Form. If this is the case, you then apply Third
Normal Form to ensure that the table has the following characteristics:
a. Each field value is independently updateable (non-transitivity); changing
the value for one field in a given record does not adversely affect the
value of any other field in that record.
b. Each field identifies a specific characteristic of the table's subject.
c. Each non-key field in the table is functionally dependent upon the entire
Primary Key.
d. The table describes one and only one subject.


 The Price field does not describe the table's subject.
 Price doesn't represent a specific characteristic of an order detail as much as it
describes a specific characteristic of a particular product.
 The Price value is actually determined by ProductID.
 Because the OrderDetails table has a Composite Primary Key consisting of OrderID
and ProductID, the value of Price is not dependent on the entire Primary Key, as
required by Third Normal Form.
 Hence the Price field can be removed from the OrderDetails table.
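
The removal of Price can be sketched as a projection of the table into two smaller tables. This is a minimal illustration, not the decomposition shown in the original figures: the Quantity column, the Products table name, and all sample values are assumptions.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate rows (first-seen order)."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result


# Hypothetical OrderDetails rows; Price depends only on ProductID.
order_details = [
    {"OrderID": 1, "ProductID": 10, "Quantity": 2, "Price": 1.50},
    {"OrderID": 1, "ProductID": 11, "Quantity": 1, "Price": 4.00},
    {"OrderID": 2, "ProductID": 10, "Quantity": 5, "Price": 1.50},
]

# Price moves to a table whose key is ProductID alone.
products = project(order_details, ["ProductID", "Price"])
order_details_3nf = project(order_details, ["OrderID", "ProductID", "Quantity"])

# Looking Price up by ProductID rebuilds the original rows, so no data is lost.
price_of = {p["ProductID"]: p["Price"] for p in products}
rebuilt = [dict(row, Price=price_of[row["ProductID"]]) for row in order_details_3nf]
print(rebuilt == order_details)
```

After the split, each price is stored once per product, so changing a price means updating a single row.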

Boyce/Codd Normal Form


 Boyce/Codd Normal Form is a different version of Third Normal Form and, indeed,
was meant to replace it. The purpose of Boyce/Codd Normal Form is twofold.
1. It ensures that a field that determines the value of any or all non-key fields
in a table must be a Candidate Key for that table.
(every determinant must be a candidate key)
2. It ensures that a table describes one and only one subject. (This is
implied by enforcing Candidate Keys.)


 The table is free of transitive dependencies and, by extension, free of modification
anomalies.
 The difference between 3NF and BCNF is that for an FD A → B, 3NF allows this
dependency in a relation if B is a primary-key attribute and A is not a candidate
key, whereas BCNF insists that for this dependency to remain in a relation, A
must be a candidate key.

 The OrderDetails table has three determinants:
1. OrderID and LineItemNumber
2. OrderID and ProductID
3. ProductID
 In order to apply Boyce/Codd Normal Form, you first need to identify whether
these determinants are Candidate Keys of the table.
 (OrderID, ProductID) and (OrderID, LineItemNumber) are the only Candidate Keys.
 Although ProductID is not a Candidate Key, it does determine the value of
Product and Price.
 The Product and Price fields are involved in transitive dependencies with both
Candidate Keys.
 Hence remove the Product and Price fields from the table.


 Now the table will be in Boyce/Codd Normal Form.

 BCNF was proposed as a simpler form of 3NF, but it was found to be stricter than
3NF: every relation in BCNF is also in 3NF, but a relation in 3NF is
not necessarily in BCNF.
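
The check "every determinant must be a candidate key" can be sketched with an attribute-closure computation. The snippet below is an illustration under assumptions: the Quantity attribute and the exact right-hand sides of the dependencies are not given in these notes, and the code tests the superkey condition (the determinant's closure covers every attribute), which is the usual formal statement of BCNF.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the nontrivial FDs whose determinant is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        nontrivial = not set(rhs) <= set(lhs)
        if nontrivial and closure(lhs, fds) != set(all_attrs):
            violations.append((lhs, rhs))
    return violations


# Assumed attributes and dependencies for the OrderDetails example.
attrs = ["OrderID", "LineItemNumber", "ProductID", "Product", "Price", "Quantity"]
fds = [
    (("OrderID", "LineItemNumber"), ("ProductID", "Quantity")),
    (("OrderID", "ProductID"), ("LineItemNumber", "Quantity")),
    (("ProductID",), ("Product", "Price")),
]
before = bcnf_violations(attrs, fds)
print(before)

# After removing Product and Price, only the two key-based FDs remain.
attrs_bcnf = ["OrderID", "LineItemNumber", "ProductID", "Quantity"]
after = bcnf_violations(attrs_bcnf, fds[:2])
print(after)
```

Under these assumptions the only violating dependency is the one with ProductID as its determinant, which matches the decision to move Product and Price out of the table.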

4NF

 A relation schema R is in the fourth normal form if and only if, whenever there
exist subsets A and B of the attributes of R such that the (nontrivial) MVD
A →→ B is satisfied, then all attributes of R are also functionally dependent on A.

 The purpose of 4NF is to ensure that a table does not contain any multi-valued
dependencies, and that it describes one and only one subject.


 A table containing multi-valued dependencies describes two or more subjects,
depending on the number of multi-valued dependencies present.
 The EmployeeCommittees table contains a single multi-valued dependency. The
first version has a field-level multi-valued dependency and the second version
contains a record-level multi-valued dependency.

 To remove the multi-valued dependencies:
1. Create a new table using the Primary Key and the first multi-valued field.
Give the new table an appropriate name.
2. Create another new table using the Primary Key and the second
multi-valued field, and so on.
3. Remove all the multi-valued fields from the original table; the
remaining fields form the revised original table.

 Example – 4NF
 The table EmployeeInformation contains two multi-valued dependencies:
- EmployeeID →→ Language
(Table 1: EmployeeLanguages)
- EmployeeID →→ DeveloperCertification
(Table 2: EmployeeCertificates)
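
The split into EmployeeLanguages and EmployeeCertificates can be sketched as two projections whose natural join gives back the original rows. The table and field names follow the example above; the sample values and the helper functions are assumptions for illustration only.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate rows (first-seen order)."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result


def natural_join(left, right):
    """Join two lists of dict rows on the attribute names they share."""
    if not left or not right:
        return []
    shared = [a for a in left[0] if a in right[0]]
    return [{**l, **r} for l in left for r in right
            if all(l[a] == r[a] for a in shared)]


# Hypothetical rows: every language is paired with every certification
# of the same employee, which is what the two MVDs imply.
employee_information = [
    {"EmployeeID": 1, "Language": "English", "DeveloperCertification": "Java"},
    {"EmployeeID": 1, "Language": "English", "DeveloperCertification": "SQL"},
    {"EmployeeID": 1, "Language": "Tamil", "DeveloperCertification": "Java"},
    {"EmployeeID": 1, "Language": "Tamil", "DeveloperCertification": "SQL"},
    {"EmployeeID": 2, "Language": "Hindi", "DeveloperCertification": "Python"},
]

employee_languages = project(employee_information, ["EmployeeID", "Language"])
employee_certificates = project(
    employee_information, ["EmployeeID", "DeveloperCertification"]
)

rejoined = natural_join(employee_languages, employee_certificates)
as_set = lambda rows: {tuple(sorted(r.items())) for r in rows}
print(as_set(rejoined) == as_set(employee_information))
```

The five original rows shrink to three rows in each new table, and adding a language for an employee no longer requires one new row per certification.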


5NF
 A relation schema R is in the fifth normal form – also called Projection / Join
Normal Form (PJ/NF) – if and only if every non-trivial join dependency that holds
for R is implied by the Candidate Keys of R.

 A join dependency exists for a given table if the table and all of its original records
can be reconstructed by an SQL JOIN operation that reunites all tables created by
its decomposition.

 A table in 4NF should be free of all transitive and multi-valued dependencies. In
most cases, you shouldn't need to decompose the table any further.

 If you suspect that you can (or should) decompose the table, check the following.
1. Can the new table(s) be created using the Primary Key or a Candidate Key as
part of the new table structure?
2. Can the original table be recreated by using an SQL JOIN operation that
reunites all of the tables created by the decomposition?
3. Will any records be lost in the process of decomposing the
table?
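
Checks 2 and 3 above can be sketched as a round trip: project the table into the proposed pieces, join the pieces back, and compare the result with the original rows. The Employees table, its fields, and the two candidate decompositions below are hypothetical.

```python
from functools import reduce


def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate rows (first-seen order)."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result


def natural_join(left, right):
    """Join two lists of dict rows on the attribute names they share."""
    if not left or not right:
        return []
    shared = [a for a in left[0] if a in right[0]]
    return [{**l, **r} for l in left for r in right
            if all(l[a] == r[a] for a in shared)]


def is_lossless(rows, components):
    """True if joining the projections gives back exactly the original rows."""
    projections = [project(rows, attrs) for attrs in components]
    rejoined = reduce(natural_join, projections)
    as_set = lambda rs: {tuple(sorted(r.items())) for r in rs}
    return as_set(rejoined) == as_set(rows)


# Hypothetical table with one confidential field (Salary).
employees = [
    {"EmployeeID": 1, "Name": "Asha", "Department": "CSE", "Salary": 50000},
    {"EmployeeID": 2, "Name": "Ravi", "Department": "CSE", "Salary": 60000},
]

# Both pieces keep the Primary Key, so the join rebuilds the table.
keyed = [["EmployeeID", "Name", "Department"], ["EmployeeID", "Salary"]]
# Here the pieces share only Department, which is not a key.
unkeyed = [["EmployeeID", "Name", "Department"], ["Department", "Salary"]]

print(is_lossless(employees, keyed))
print(is_lossless(employees, unkeyed))
```

In the second decomposition the join pairs every employee in a department with every salary in that department, so the rebuilt table contains rows that were never in the original. Information is lost even though no row disappears.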

 Example – Join Dependency
 The table has no transitive or multi-valued dependencies.
 It is decomposed in order to secure the confidential information.

*******************
