2 - Normalization in DBMS
📌 Lecture Outline
1. Definition of Normalization
2. Goal of Normalization
📌 1. Definition of Normalization
📌 Definition:
Normalization is the process of organizing data in a relational database to reduce
redundancy and improve data integrity by breaking a larger table into smaller related
tables.
✅ Why Normalize?
📌 2. Goal of Normalization
📍 The main objectives of normalization are:
✅ Eliminate Redundant Data – Avoid storing the same fact in multiple places.
✅ Organize Data Logically – Store data in smaller, well-structured tables.
✅ Improve Database Performance – Less duplicated data reduces disk usage and makes writes cheaper, although reads may need more joins.
✅ Ensure Data Integrity – Prevent inconsistencies caused by anomalies.
✅ Make Data More Flexible – A fact can be updated in one place without affecting other records.
When a table is not normalized, it leads to data anomalies, which affect the correctness of
the database.
🔹 A. Update Anomalies
📌 Occurs when a change in one record requires multiple updates in different places.
✅ Example:
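As a minimal sketch of an update anomaly, assume a hypothetical `employee` table in which each row repeats its department's location. The table, its columns, and the employee Carol are illustrative assumptions, not taken from these notes. Updating only one of the IT rows leaves the department with two conflicting locations.

```python
import sqlite3

# Hypothetical unnormalized table: the department location is repeated
# on every employee row (names are illustrative assumptions).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER, name TEXT, dept TEXT, dept_location TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (101, "Alice", "IT", "New York"),
        (102, "Bob", "HR", "Los Angeles"),
        (103, "Carol", "IT", "New York"),
    ],
)

# The IT department moves, but only one of its two rows is updated.
conn.execute("UPDATE employee SET dept_location = 'Boston' WHERE emp_id = 101")

# The same department now reports two different locations.
it_locations = sorted(
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT dept_location FROM employee WHERE dept = 'IT'"
    )
)
print(it_locations)
```

A correct update would have to touch every IT row, which is exactly the "multiple updates in different places" problem.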
🔹 B. Deletion Anomalies
📌 Occurs when deleting a record also removes other information that was stored only in that record.
✅ Example:
---------------------------------------------
P1 | AI Dev | IT | Alice
P2 | HR App | HR | Bob
❌ Problem: If Project P2 is deleted, we lose HR department info.
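The deletion above can be reproduced with a small sketch. The two rows come from the example; the column names (`proj_id`, `project`, `dept`, `person`) are assumptions, because the table header is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names are assumptions; only the row values come from the example.
conn.execute(
    "CREATE TABLE project (proj_id TEXT, project TEXT, dept TEXT, person TEXT)"
)
conn.executemany(
    "INSERT INTO project VALUES (?, ?, ?, ?)",
    [("P1", "AI Dev", "IT", "Alice"), ("P2", "HR App", "HR", "Bob")],
)

# Deleting project P2 removes the only row that mentions the HR department.
conn.execute("DELETE FROM project WHERE proj_id = 'P2'")

remaining_depts = [
    row[0] for row in conn.execute("SELECT DISTINCT dept FROM project")
]
print(remaining_depts)
```

After the delete, nothing in the database records that an HR department ever existed.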
🔹 C. Insert Anomalies
📌 Occurs when new data cannot be added unless other, unrelated data already exists.
✅ Example:
❌ Problem: Cannot add a new instructor without at least one student enrolled.
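A minimal sketch of this problem, assuming a hypothetical `enrollment` table that stores the instructor next to each student's enrollment. The table layout and the instructor name are illustrative assumptions. Because the student ID is part of the key, a row without a student is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single table mixing enrollment and instructor data.
# stud_id is part of the primary key and is declared NOT NULL.
conn.execute(
    """
    CREATE TABLE enrollment (
        stud_id    INTEGER NOT NULL,
        course     TEXT    NOT NULL,
        instructor TEXT,
        PRIMARY KEY (stud_id, course)
    )
    """
)

# A new instructor with no enrolled students has no stud_id to supply.
try:
    conn.execute("INSERT INTO enrollment VALUES (NULL, 'DBMS', 'Dr. Smith')")
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
print(insert_rejected, row_count)
```

Moving instructors into their own table would let them be inserted independently of any enrollment.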
Normalization is applied in progressive steps (1NF → 2NF → 3NF → BCNF → 4NF → 5NF). Each
normal form builds on the previous one and removes a further kind of redundancy.
1. First Normal Form (1NF): each column should hold atomic values (no multiple values in one cell).
✅ After 1NF
Stud_ID | Name | Course
--------------------------------
101 | Alice | AI
102 | Bob | AI
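The move to 1NF can be sketched in a few lines. The unnormalized input below is an assumption (a comma-separated course list per student); the helper `to_1nf` is a hypothetical name used only for illustration.

```python
# Hypothetical unnormalized rows: the Course cell holds a comma-separated list.
unnormalized = [
    (101, "Alice", "DBMS, AI"),
    (102, "Bob", "AI"),
]


def to_1nf(rows):
    """Emit one row per (student, course) pair so every cell is atomic."""
    atomic = []
    for stud_id, name, courses in rows:
        for course in courses.split(","):
            atomic.append((stud_id, name, course.strip()))
    return atomic


rows_1nf = to_1nf(unnormalized)
for row in rows_1nf:
    print(row)
```

Each output row now holds exactly one course, at the cost of repeating the student's name, which is what 2NF addresses next.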
2. Second Normal Form (2NF): remove partial dependencies (a part of a composite primary key
should not determine a non-key attribute on its own).
--------------------------------------------
101 | C1 | DBMS
101 | C2 | AI
✅ After 2NF
Student (Stud_ID, Name)
------------------------
101 | Alice
102 | Bob
--------------------------------
C1 | DBMS
C2 | AI
--------------------------------
101 | C1
101 | C2
102 | C2
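The three tables above can be loaded into SQLite as a rough sketch. `Student (Stud_ID, Name)` comes from the notes; the names `course`, `enrollment`, and `course_id` are assumptions for the two unlabeled tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# student comes from the notes; course and enrollment are assumed names.
conn.executescript(
    """
    CREATE TABLE student (stud_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course (course_id TEXT PRIMARY KEY, course_name TEXT);
    CREATE TABLE enrollment (
        stud_id   INTEGER REFERENCES student(stud_id),
        course_id TEXT    REFERENCES course(course_id),
        PRIMARY KEY (stud_id, course_id)
    );
    INSERT INTO student VALUES (101, 'Alice'), (102, 'Bob');
    INSERT INTO course VALUES ('C1', 'DBMS'), ('C2', 'AI');
    INSERT INTO enrollment VALUES (101, 'C1'), (101, 'C2'), (102, 'C2');
    """
)

# Renaming a course is now a single-row update.
conn.execute(
    "UPDATE course SET course_name = 'Artificial Intelligence' WHERE course_id = 'C2'"
)

# Every enrollment sees the new name through the join.
joined = conn.execute(
    """
    SELECT s.name, c.course_name
    FROM enrollment e
    JOIN student s ON s.stud_id = e.stud_id
    JOIN course  c ON c.course_id = e.course_id
    ORDER BY s.stud_id, c.course_id
    """
).fetchall()
print(joined)
```

The course name depends only on the course ID, so it lives in one place and no longer repeats per enrollment.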
3. Third Normal Form (3NF): remove transitive dependencies (a non-key attribute should not depend on another non-key attribute).
✅ After 3NF
-----------------------------
101 | Alice | IT
102 | Bob | HR
--------------------------------
IT | New York
HR | Los Angeles
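A short sketch of the two tables above in SQLite. The table names and the `emp_id` column are assumptions, since only the row values are shown. It also shows that the department survives when its last employee is removed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table names and emp_id are assumptions; row values come from the example.
conn.executescript(
    """
    CREATE TABLE department (dept_name TEXT PRIMARY KEY, location TEXT);
    CREATE TABLE employee (
        emp_id    INTEGER PRIMARY KEY,
        name      TEXT,
        dept_name TEXT REFERENCES department(dept_name)
    );
    INSERT INTO department VALUES ('IT', 'New York'), ('HR', 'Los Angeles');
    INSERT INTO employee VALUES (101, 'Alice', 'IT'), (102, 'Bob', 'HR');
    """
)

# Location is reached through the department, not stored per employee.
alice_location = conn.execute(
    """
    SELECT d.location
    FROM employee e
    JOIN department d ON d.dept_name = e.dept_name
    WHERE e.emp_id = 101
    """
).fetchone()[0]

# Removing the only HR employee no longer erases the HR department.
conn.execute("DELETE FROM employee WHERE emp_id = 102")
hr_location = conn.execute(
    "SELECT location FROM department WHERE dept_name = 'HR'"
).fetchone()[0]
print(alice_location, hr_location)
```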
4. Boyce-Codd Normal Form (BCNF): every determinant must be a candidate key.
---------------------------------
P1 | DBMS | CS
P2 | AI | CS
✅ After BCNF
---------------------------
P1 | DBMS
P2 | AI
---------------------
DBMS | CS
AI | CS
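The split above suggests that the second column determines the third. A functional dependency like this can be checked against sample rows with a small helper. The column names and the function `fd_holds` are hypothetical, since the table headers are not shown. Note that sample data can only refute a dependency, never prove that it holds in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if each value of column `lhs` maps to exactly one value of `rhs`."""
    seen = {}
    for row in rows:
        key, value = row[lhs], row[rhs]
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Rows from the example; the column names are assumptions.
rows = [
    {"id": "P1", "subject": "DBMS", "dept": "CS"},
    {"id": "P2", "subject": "AI", "dept": "CS"},
]

subject_determines_dept = fd_holds(rows, "subject", "dept")
dept_determines_subject = fd_holds(rows, "dept", "subject")
print(subject_determines_dept, dept_determines_subject)
```

Here `subject` determines `dept` in the sample, while `dept` does not determine `subject`, which matches a decomposition that gives the subject-to-department mapping its own table.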
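5. Fourth Normal Form (4NF): remove multi-valued dependencies (independent multi-valued facts about one key should not share a table).
Author_ID | Book | Magazine
----------------------------------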
A1 | DBMS | AI Journal
A1 | DBMS | CS Review
✅ After 4NF
-----------------------------
A1 | DBMS
-------------------------------------
A1 | AI Journal
A1 | CS Review
After achieving BCNF, higher normalization forms 4NF, 5NF, and DKNF are used to further
eliminate redundancy and anomalies.
📌 Definition:
A table is in 4NF if:
1. It is in BCNF.
2. It has no non-trivial multi-valued dependencies, except those whose determinant is a superkey.
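Author_ID | Book | Magazine
----------------------------------
A1 | DBMS | AI Journal
A1 | DBMS | CS Review
A1 | AI | AI Journal
A1 | AI | CS Review
📌 Problem:
An author's books and magazines are independent of each other, yet every book must be paired with every magazine, so each new book or magazine adds several redundant rows.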
✅ After 4NF
Author_Books Table
Author_ID | Book
-----------------------------
A1 | DBMS
A1 | AI
Author_Magazines Table
Author_ID | Magazine
-----------------------------
A1 | AI Journal
A1 | CS Review
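As a quick check that this split loses nothing, the sketch below projects the four original rows onto the two tables and joins them back on the author ID. Plain Python sets stand in for the tables; the variable names are illustrative.

```python
# The four rows of the table before 4NF.
original = {
    ("A1", "DBMS", "AI Journal"),
    ("A1", "DBMS", "CS Review"),
    ("A1", "AI", "AI Journal"),
    ("A1", "AI", "CS Review"),
}

# Project onto the two 4NF tables (sets drop the duplicate pairs).
author_books = {(author, book) for author, book, _ in original}
author_magazines = {(author, magazine) for author, _, magazine in original}

# Join the two tables back together on the author ID.
rejoined = {
    (author, book, magazine)
    for author, book in author_books
    for other_author, magazine in author_magazines
    if author == other_author
}

lossless = rejoined == original
print(len(original), len(author_books), len(author_magazines), lossless)
```

Four combined rows shrink to two rows per table, and the join reproduces the original exactly.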
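📌 Definition:
A table is in 5NF if:
1. It is in 4NF.
2. Every non-trivial join dependency in it is implied by its candidate keys, so it cannot be split further without losing information.
📍 Join Dependency:
A relation has a join dependency if it can be split into several smaller tables and
rebuilt exactly, with no rows lost or invented, only by joining all of those tables together.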
📌 Problem:
✅ After 5NF
Employee_Skill Table
Emp_ID | Skill
-----------------------------
101 | Python
101 | Java
102 | Python
102 | Java
Project_Skill Table
Project | Skill
-----------------------------
AI Dev | Python
HR App | Java
Employee_Project Table
Emp_ID | Project
-----------------------------
101 | AI Dev
101 | HR App
102 | AI Dev
102 | HR App
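The sketch below joins the three tables above to illustrate why all of them are needed. The unsplit table is not shown in these notes, so this only demonstrates what the joins produce; it does not verify the result against an original. Joining just two of the tables invents combinations that the third table rules out.

```python
employee_skill = {(101, "Python"), (101, "Java"), (102, "Python"), (102, "Java")}
project_skill = {("AI Dev", "Python"), ("HR App", "Java")}
employee_project = {(101, "AI Dev"), (101, "HR App"), (102, "AI Dev"), (102, "HR App")}

# Two-table join on Emp_ID only: pairs every skill with every project.
two_way = {
    (emp, proj, skill)
    for emp, skill in employee_skill
    for other_emp, proj in employee_project
    if emp == other_emp
}

# Three-table join: keep a row only if all three pairings exist.
three_way = sorted(
    (emp, proj, skill)
    for emp, skill in employee_skill
    for proj, proj_skill in project_skill
    if skill == proj_skill and (emp, proj) in employee_project
)

print(len(two_way))
for row in three_way:
    print(row)
```

The two-table join yields 8 rows, including pairings such as employee 101 using Java on AI Dev, while the three-table join keeps only the 4 rows consistent with every table.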
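📌 Definition:
A table is in DKNF if:
1. It is in 5NF.
2. Every constraint on the table follows logically from its domain constraints and key constraints.
📍 Practical Use:
✅ Example:
A rule like "Salary must be > 0" is a domain constraint. When every rule a table must obey can be
declared in the schema this way, as a domain or key constraint, the table is DKNF-compliant.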
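A minimal sketch of storing the salary rule in the schema, using a SQLite `CHECK` constraint. The `employee` table and its columns are hypothetical. The database itself rejects a non-positive salary, so no application code has to enforce the rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the CHECK clause keeps the salary rule in the schema.
conn.execute(
    """
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        salary REAL NOT NULL CHECK (salary > 0)
    )
    """
)

conn.execute("INSERT INTO employee VALUES (101, 'Alice', 5000)")

# A negative salary violates the CHECK constraint and is refused.
try:
    conn.execute("INSERT INTO employee VALUES (102, 'Bob', -100)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(rejected, row_count)
```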
Name VARCHAR(50),
);
Stud_ID INT,
Course VARCHAR(50),
);
Course_Name VARCHAR(50)
);
Dept_Name VARCHAR(50),
Location VARCHAR(50)
);
Higher normal forms (BCNF, 4NF, 5NF, DKNF) remove the remaining redundancy that 3NF can leave behind, which gives the strongest data integrity guarantees.