0% found this document useful (0 votes)
113 views32 pages

Normalization

The document discusses normalization of SQL databases. It defines normalization as a technique to minimize redundancy and dependency in database schemas by splitting large tables into smaller tables and defining relationships between them. The document covers various normal forms including 1NF, 2NF, 3NF, BCNF and anomalies like insert, delete and update anomalies. It provides examples to explain different normal forms and how tables can be normalized to eliminate anomalies and redundancy by following the normal forms.

Uploaded by

Syed Ameer
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
113 views32 pages

Normalization

The document discusses normalization of SQL databases. It defines normalization as a technique to minimize redundancy and dependency in database schemas by splitting large tables into smaller tables and defining relationships between them. The document covers various normal forms including 1NF, 2NF, 3NF, BCNF and anomalies like insert, delete and update anomalies. It provides examples to explain different normal forms and how tables can be normalized to eliminate anomalies and redundancy by following the normal forms.

Uploaded by

Syed Ameer
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 32

NORMALIZATION

SQL DATABASES
CONTENTS:
1. What is Normalization ↗
2. Anomaly and their types ↗
3. Types of Normalization ↗
4. 1NF ↗
5. 2NF ↗
6. 3NF ↗
7. BCNF ↗
8. 4NF
9. 5NF
Normalization
• Database normalization is a database schema design technique, by
which an existing Schema is modified to minimize redundancy and
dependency of data.
• Normalization split a large table into smaller tables and define
relationships between them to increase the clarity in organizing data.
Why is normalization needed?

• To avoid redundancy.
• To avoid/minimize anomalies and other issues.
Anomaly
• Problems that occur in poorly planned, unnormalized databases where all
the unrelated data is stored in one table.
Types of Anomalies
• insert
• delete
• update
Insert anomaly

• Insert anomaly occurs when data of certain attributes cannot be inserted into the
database without the presence of other attributes.
Delete anomaly

• Delete anomaly exists when data of certain attributes are lost because of the
deletion of other attributes.
Update anomaly

• An update anomaly exists when one or more instances of duplicated data is


updated, but not all.
The following table is used as an example.
Student Table:

stu_id stu_name location course_name course_duration

s01 Anitha Madurai C++ 6 months


s02 Soundharya Madurai C++ 6 months
s03 Maheswari Chennai JAVA 5 months
s04 Sahana Dindigul JAVA 5 months
s05 Malaviga Coimbatore JAVA 5 month
s06 Keerthana Dindigul Python 3 months
s07 Dharmendhini Chennai Python 3 months
s08 Shibana Salem Web designing 4 months
Redundancy:
• In the table, the attributes location, course_name and
course_duration contain repeated values. This leads to storage space wastage
and also creates anomalies.
Insert anomaly:
• If a new course is introduced in the institution, unless a student is admitted to it, it can’t
be included. The same problem exists for the inclusion of new student too. A newly
admitted student details can’t be recorded until he/she joins a course.
Delete anomaly:
• If a course is to be dropped by the institution, it cannot be removed from the table. If we
do so, the details of the students studied that course will also be deleted.
Update anomaly
• If there is a need to change the course_duration of course JAVA from 5 months to 6
months, this will not be updated in all the locations. If you observe the table keenly, you
can find in the 5th record contains month whereas 3rd and 4th record contains months.
The above-discussed redundancies and anomalies can be removed by splitting
the table into two tables which follow.

Course Table:
Course Name Course Duration
C++ 6 months

Java 5 months

Python 3 months

Web Designing 4 months


Types of Normalization

• First Normal Form (1NF)


• Second Normal Form (2NF)
• Third Normal Form (3NF)
• Boyce Codd Normal Form (BCNF)
• Fourth Normal Form (4NF)
• Fifth Normal Form (5NF)
FIRST NORMAL FORM
[ 1NF ]
A table is said to be in first normal form, if the following conditions exist:
• The table should contain only atomic-valued attributes.
• Values stored in the column should be of the same domain.
• Every column name should be unique.
• No duplicate rows should exist.
• Primary/Composite key should be defined.
• Ordering of rows and columns doesn't matter.
The following table is taken for example discussion:
Consider p1 and p2 as valid contact numbers.

• In the attributes address and contact_no, multivalued attributes are present. It


should be split into two attributes to convert the table into 1NF.
• The values in the age attribute are not in the same domain. It should be either
numeric or character.
After doing the changes, the table becomes,

stud_id stud_name age City State contact_no

s001 Geetha 18 Chennai Tamilnadu P1

s001 Geetha 18 Chennai Tamilnadu P2

s002 Keerthi 18 Madurai Tamilnadu P1

s003 Sharmina 17 Coimbatore Tamilnadu P1

s003 Sharmina 17 Coimbatore Tamilnadu P2

s004 Gowri 17 Tirunelveli Tamilnadu P1

s005 Sangeetha 18 Theni Tamilnadu P1


If most of the students have more than one contact no, then the table consumes
more storage space. So to minimize the space as well as for better performance,
we split the table into 2 tables having stud_id as the primary key in both the tables
which follows.
CONTACT
stud_id contact_no

s001 P1

s001 P2

s002 P1

s003 P1

s003 P2

s004 P1

s005 P1
SECOND NORMAL FORM
[ 2NF ]
A table is said to be in 2NF, if it satisfies the following two conditions.
1. The table should be in 1NF.
2. The table should not contain any partial dependencies.
(or)
All non-prime attributes should depend on the entire primary key, not on the
part of any primary key. (i.e. candidate key)
(or)
No non-prime attribute is dependent on any proper subset of any candidate
key of the table.
Partial Dependency

If the values of the non-prime attributes of a table is uniquely identifiable by


using the subset of candidate keys, then it is called as Partial Dependency.

Conversion of 1NF to 2NF

If a partial dependency exists, remove the partially dependent attribute(s)


from the relation by placing them in a new relation along with a copy of their
determinant.
Example 1:
EMPLOYEE
eid ename dob salary

• The attribute eid (employee id) is the primary key for the EMPLOYEE table.
• All other non-prime attributes ename (employee name), dob (date of birth) and salary
are completely dependent on the primary key attribute eid which means only by
using eid value, values of all other non-prime attributes in the table are uniquely
identifiable.
Example 2:
stud_no course_no course_fee · In the STUDENT table, the attributes stud_no and course_no
are the primary key attributes because any student can
101 C1 1500
learn any number of courses.
101 C3 1500
· Here the non-prime attribute course_fee is partially dependent
because its value can be identified by using course_fee alone.
102 C1 1500
· Hence the table is not in 2NF.
· To convert this table into 2NF, we split it into two tables which
103 C1 1500 are in 2NF.
· stud_no and course_no are the primary key for the tables
104 C2 2000 STUDENT and COURSE respectively.

104 C3 1500

105 C4 3000
STUDENT
stud_no course_no COURSE
101 C1 course_no course_fee
101 C3
C1 1500
102 C1
C2 2000
103 C1
C3 1500
104 C2

104 C3 C4 3000

105 C4
THIRD NORMAL FORM
[ 3NF ]
A table is said to be in 3NF or third normal form, if the following requirements are
satisfied.
• All 2NF requirements are fulfilled.
• There is no transitive dependency. (i.e.) A non-key column should not depend on
another non-key column.
(OR)
A relation is in 3NF if at least one of the following conditions holds in every non-trivial
functional dependency X -> Y
• X is a super key.
• Y is a Prime attribute.
Conversion to 3NF

Remove the transitively dependent attribute from the relation and place it in a
new relation along with the determinant.
Example: BOOK_DETAILS
Book_ID Category_ID Category_type Price
101 1 Competitive 600
102 2 Sports 350
103 2 Sports 275
104 1 Competitive 750
105 3 Novel 500
106 4 Poetry 350
• In the relation BOOK_DETAILS, the Book_ID attribute is the Primary Key attribute.
Because using BOOK_ID the values of all attributes such as Category_ID,
Category_type and Price of every record can be uniquely identified.
• The Non-prime attributes Price, Category_ID and Category_type are dependent on the
Primary Key.
• But the Category_type is also dependent on the non-prime attribute Category_ID.
• So it is transitively dependent on the Primary Key. i.e. The Primary Key attribute
BOOK_ID can find the Category_type through Category_ID too.

So to convert the relation to 3NF it is decomposed into 2 relations such as given below
BOOK_DETAILS CATEGORY_DETAILS
Book_ID Category_ID Price Category_ID Category_type

101 1 600
1 Competitive

102 2 350
2 Sports

103 2 275
3 Novel

104 1 750
4 Poetry

105 3 500
5 Puzzle
106 4 350
BOYCE CODD NORMAL
FORM
[ BCNF ]
• It is also called as 3.5NF
A table is said to be BCNF, if it satisfies the following conditions.
1. It should satisfy 3nf
2. All determinants must be candidate key.

You might also like