Database Management Systems – Normalization
🔶 Introduction to Normalization
Normalization is a database design technique used to minimize data redundancy and eliminate
anomalies in relational databases.
📌 The goal of normalization is to organize data into well-structured tables to ensure data
consistency, integrity, and efficient querying.
🔷 Why Normalize a Database?
✅ Advantages:
Reduces data redundancy
Eliminates update, insert, and delete anomalies
Improves data integrity
Makes maintenance and updates easier
❌ Disadvantages (if overdone):
May require complex joins in queries
Can reduce read performance in highly normalized databases
🔶 Types of Anomalies (Problems in Unnormalized Data)
Anomaly Type | Description
Insert | Data cannot be added because other, related data is missing
Update | A change must be applied to multiple rows (risk of mismatch)
Delete | Deleting one item also removes related useful data
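The update and delete anomalies can be reproduced in a few lines. This is a minimal sketch using Python's built-in sqlite3 module; the table name enrollment, the second student, and the instructor Ms. Lee are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical unnormalized table: the instructor is repeated on every enrollment row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollment (student_id TEXT, course TEXT, instructor TEXT)")
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?)",
    [
        ("001", "Math", "Mr. Smith"),
        ("002", "Math", "Mr. Smith"),
        ("002", "Physics", "Ms. Lee"),
    ],
)

# Update anomaly: changing the instructor on only one row leaves two
# conflicting instructor names for the same course.
conn.execute(
    "UPDATE enrollment SET instructor = 'Mr. Jones' "
    "WHERE course = 'Math' AND student_id = '001'"
)
math_instructors = {
    row[0]
    for row in conn.execute("SELECT instructor FROM enrollment WHERE course = 'Math'")
}

# Delete anomaly: removing the only Physics enrollment also erases the
# fact that Ms. Lee teaches Physics.
conn.execute("DELETE FROM enrollment WHERE student_id = '002' AND course = 'Physics'")
physics_rows = conn.execute(
    "SELECT COUNT(*) FROM enrollment WHERE course = 'Physics'"
).fetchone()[0]

print(sorted(math_instructors))  # ['Mr. Jones', 'Mr. Smith']
print(physics_rows)              # 0
```

The insert anomaly is the mirror image: in this layout a new course and its instructor cannot be recorded until at least one student enrolls.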
🔷 The Normal Forms (1NF to 5NF)
Normalization involves applying rules called Normal Forms. Each level builds upon the
previous one.
🔹 1NF – First Normal Form
Rule:
Atomic (indivisible) values only
No repeating groups
Violation Example:
StudentID | Name | Courses
001 | Alice | Math, Physics
✅ Fix:
Split multivalued fields into separate rows.
StudentID | Name | Course
001 | Alice | Math
001 | Alice | Physics
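The same split can be sketched in Python. The function name to_first_normal_form is hypothetical, and the sketch assumes the multivalued field is a comma-separated string.

```python
# Unnormalized rows: the Courses field holds a comma-separated list.
unnormalized = [("001", "Alice", "Math, Physics")]


def to_first_normal_form(rows):
    """Return one (student_id, name, course) row per course."""
    result = []
    for student_id, name, courses in rows:
        for course in courses.split(","):
            result.append((student_id, name, course.strip()))
    return result


rows_1nf = to_first_normal_form(unnormalized)
print(rows_1nf)  # [('001', 'Alice', 'Math'), ('001', 'Alice', 'Physics')]
```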
🔹 2NF – Second Normal Form
Rule:
Be in 1NF
No partial dependency (every non-prime attribute must depend on the whole key, not on just part of it)
🔍 Only relevant when the key is composite; a 1NF table with a single-column key is already in 2NF
Violation Example:
StudentID | Course | InstructorName
001 | Math | Mr. Smith
InstructorName depends on Course alone, not on the full key (StudentID, Course).
✅ Fix:
Separate the Course-Instructor relation into a new table.
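One possible shape for the decomposed schema, sketched with Python's built-in sqlite3 module. The table and column names, and the second student 002, are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The instructor is stored once per course.
    CREATE TABLE course_instructor (
        course TEXT PRIMARY KEY,
        instructor_name TEXT NOT NULL
    );
    -- Enrollment keeps only the composite key. REFERENCES documents the link;
    -- SQLite enforces it only when PRAGMA foreign_keys = ON is set.
    CREATE TABLE enrollment (
        student_id TEXT NOT NULL,
        course TEXT NOT NULL REFERENCES course_instructor(course),
        PRIMARY KEY (student_id, course)
    );
""")
conn.execute("INSERT INTO course_instructor VALUES ('Math', 'Mr. Smith')")
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?)", [("001", "Math"), ("002", "Math")]
)

# A join rebuilds the original three-column view when it is needed.
rows = conn.execute("""
    SELECT e.student_id, e.course, c.instructor_name
    FROM enrollment AS e
    JOIN course_instructor AS c ON c.course = e.course
    ORDER BY e.student_id
""").fetchall()
print(rows)  # [('001', 'Math', 'Mr. Smith'), ('002', 'Math', 'Mr. Smith')]
```

With this layout, changing the Math instructor touches exactly one row in course_instructor.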
🔹 3NF – Third Normal Form
Rule:
Be in 2NF
No transitive dependency (a non-key attribute must not depend on another non-key attribute)
Violation Example:
StudentID | Name | DeptID | DeptName
DeptName depends on DeptID, which is not a key, so DeptName depends on the primary key StudentID
only transitively (StudentID → DeptID → DeptName).
✅ Fix:
Separate Department info into its own table.
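A small helper can test whether a dependency such as DeptID → DeptName is consistent with sample data, and the split can then be done with plain Python structures. This is a sketch: the helper name fd_holds and all sample values are hypothetical. Passing the check on sample rows only shows the data does not contradict the dependency; whether it is a real rule is a design decision.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if equal values on lhs always come with equal values on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


students = [
    {"StudentID": "001", "Name": "Alice", "DeptID": "D1", "DeptName": "Physics"},
    {"StudentID": "002", "Name": "Bob", "DeptID": "D1", "DeptName": "Physics"},
    {"StudentID": "003", "Name": "Carol", "DeptID": "D2", "DeptName": "Math"},
]

print(fd_holds(students, ["DeptID"], ["DeptName"]))   # True
print(fd_holds(students, ["DeptID"], ["StudentID"]))  # False

# 3NF split: department names move to their own table, keyed by DeptID.
student_table = [
    {k: row[k] for k in ("StudentID", "Name", "DeptID")} for row in students
]
department_table = {row["DeptID"]: row["DeptName"] for row in students}
print(department_table)  # {'D1': 'Physics', 'D2': 'Math'}
```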
🔹 BCNF – Boyce-Codd Normal Form
Rule:
Be in 3NF (BCNF is a stricter version of 3NF)
For every non-trivial functional dependency, the determinant must be a candidate key (more precisely, a superkey)
Violation Example:
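Course | Instructor | Room
Suppose each instructor teaches only one course (Instructor → Course), and a course held in a given
room has exactly one instructor ((Course, Room) → Instructor). The candidate keys are (Course, Room)
and (Instructor, Room). Instructor determines Course but is not a candidate key, so the table is
in 3NF but violates BCNF.
✅ Fix:
Break into two tables: (Instructor, Course) and (Instructor, Room).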
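A BCNF check can be sketched with the standard attribute-closure computation. The two dependencies below (Instructor → Course and (Course, Room) → Instructor) are assumptions made for this example, and the function names closure and bcnf_violations are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the non-trivial dependencies whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != set(attributes)
    ]


attributes = {"Course", "Instructor", "Room"}
fds = [
    ({"Instructor"}, {"Course"}),          # assumed: one course per instructor
    ({"Course", "Room"}, {"Instructor"}),  # assumed: one instructor per course and room
]

violations = bcnf_violations(attributes, fds)
for lhs, rhs in violations:
    print(sorted(lhs), "->", sorted(rhs))  # ['Instructor'] -> ['Course']
```

Only Instructor → Course is reported: the closure of {Course, Room} covers every attribute, so that dependency has a superkey on its left side.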
🔹 4NF – Fourth Normal Form
Rule:
Be in BCNF
No non-trivial multi-valued dependencies, except those whose determinant is a superkey
Violation Example:
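Student | Hobby | Language
Alice | Painting | English
Alice | Painting | French
Alice | Cycling | English
Alice | Cycling | French
Hobbies and languages are independent facts about a student, so every hobby has to be paired with
every language, and adding one hobby means adding one row per language.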
✅ Fix:
Separate hobbies and languages into different tables.
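The split can be checked with plain Python sets: projecting the table onto (Student, Hobby) and (Student, Language) and joining the two projections on Student should give back exactly the original rows. The variable names are chosen for this sketch.

```python
rows = {
    ("Alice", "Painting", "English"),
    ("Alice", "Painting", "French"),
    ("Alice", "Cycling", "English"),
    ("Alice", "Cycling", "French"),
}

# Project onto the two independent facts.
student_hobby = {(student, hobby) for student, hobby, _ in rows}
student_language = {(student, language) for student, _, language in rows}

# Natural join on Student rebuilds the original table.
rejoined = {
    (student, hobby, language)
    for student, hobby in student_hobby
    for other_student, language in student_language
    if student == other_student
}

print(len(student_hobby), len(student_language))  # 2 2
print(rejoined == rows)                           # True
```

Four combined rows shrink to two rows per table, and a new hobby now costs one row instead of one row per language.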
🔹 5NF – Fifth Normal Form (Project-Join Normal Form)
Rule:
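Be in 4NF
Every non-trivial join dependency must be implied by the candidate keys
In other words, the table cannot be split into smaller tables and rejoined without loss, except by
splits based on its keys
⚠️ Rarely needed in practice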
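A join dependency is easiest to see on a small data set. The sketch below uses a hypothetical supplier/part/project table (all values invented for illustration): joining any two of its projections produces a spurious row, but joining all three gives back the original rows. If that pattern is a rule of the data rather than a coincidence of this sample, the table is not in 5NF and the three projections are its decomposition.

```python
# Hypothetical table: (supplier, part, project)
rows = {
    ("S1", "P1", "J2"),
    ("S1", "P2", "J1"),
    ("S2", "P1", "J1"),
    ("S1", "P1", "J1"),
}

# The three two-column projections.
supplier_part = {(s, p) for s, p, _ in rows}
part_project = {(p, j) for _, p, j in rows}
project_supplier = {(j, s) for s, _, j in rows}

# Joining only two projections (on part) is lossy: it invents ("S2", "P1", "J2").
two_way = {
    (s, p, j)
    for s, p in supplier_part
    for other_p, j in part_project
    if p == other_p
}

# Filtering by the third projection completes the three-way join.
three_way = {t for t in two_way if (t[2], t[0]) in project_supplier}

print(len(two_way))       # 5
print(three_way == rows)  # True
```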
🔷 Summary Table of Normal Forms
Normal Form | Key Requirement
1NF | Atomic values, no repeating groups
2NF | No partial dependency (relevant only for composite keys)
3NF | No transitive dependency
BCNF | Determinants must be candidate keys
4NF | No non-trivial multi-valued dependency
5NF | No join dependency beyond those implied by candidate keys
🔶 Denormalization
In some cases, data is intentionally denormalized to improve performance, especially for
reporting or read-heavy systems.
✅ Use denormalization when:
Queries require too many joins
Read performance matters more than the extra cost of keeping redundant copies consistent on writes
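The trade-off can be sketched with Python's built-in sqlite3 module. The table names, column names, and sample values are assumptions made for this example: a reporting table is built once from normalized tables, reads then need no join, but a later change to the source is not reflected in the copy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id TEXT PRIMARY KEY, dept_name TEXT NOT NULL);
    CREATE TABLE student (
        student_id TEXT PRIMARY KEY,
        name TEXT NOT NULL,
        dept_id TEXT NOT NULL REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES ('D1', 'Physics');
    INSERT INTO student VALUES ('001', 'Alice', 'D1');

    -- Denormalized copy for reporting: dept_name is duplicated per student.
    CREATE TABLE student_report AS
        SELECT s.student_id, s.name, s.dept_id, d.dept_name
        FROM student AS s
        JOIN department AS d ON d.dept_id = s.dept_id;
""")

# Reads from the copy need no join.
report = conn.execute("SELECT name, dept_name FROM student_report").fetchall()
print(report)  # [('Alice', 'Physics')]

# The cost: renaming the department in the normalized table does not
# reach the copy, which must be refreshed or updated separately.
conn.execute("UPDATE department SET dept_name = 'Applied Physics' WHERE dept_id = 'D1'")
stale = conn.execute("SELECT dept_name FROM student_report").fetchone()[0]
print(stale)  # Physics
```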
📝 Common Questions
Q1: What's the difference between 3NF and BCNF?
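Every BCNF table is in 3NF, but not every 3NF table is in BCNF. 3NF still allows a dependency
whose determinant is not a candidate key as long as the dependent attribute is prime (part of some
candidate key); BCNF forbids that too. The difference shows up only with overlapping candidate keys.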
Q2: Is normalization always necessary?
No. OLAP systems (data warehouses) often use denormalized structures such as star schemas;
snowflake schemas normalize the dimension tables again.
Q3: Can a table be in 2NF and not in 1NF?
No. Each form builds on the previous; 1NF is always the first step.