2 - Normalization in DBMS

The document provides a comprehensive overview of normalization in database management systems (DBMS), detailing its definition, goals, and types of anomalies it addresses. It outlines various normalization forms from 1NF to DKNF, explaining the rules and examples for each form to illustrate how to eliminate redundancy and improve data integrity. The document concludes with a summary of practical implementation steps for normalization.

Uploaded by

dinesh raja
📘 Normalization in DBMS – Detailed Explanation with Examples

📌 Lecture Outline

1. Definition of Normalization

2. Goal of Normalization

3. Types of Anomalies (Update, Deletion, Insert)

4. Different Normalization Forms

o 1NF (First Normal Form)

o 2NF (Second Normal Form)

o 3NF (Third Normal Form)

o BCNF (Boyce-Codd Normal Form)

o 4NF (Fourth Normal Form)

o 5NF (Fifth Normal Form)

o DKNF (Domain-Key Normal Form)

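5. Higher Normal Forms (4NF, 5NF, DKNF)

6. Summary Table for All Normal Forms

7. Practical Implementation of Normalization

8. Conclusion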

📌 1. Definition of Normalization

📌 Definition:
Normalization is the process of organizing data in a relational database to reduce
redundancy and improve data integrity by breaking a larger table into smaller related
tables.

✅ Why Normalize?

• Removes data redundancy.

• Reduces insertion, deletion, and update anomalies.

• Ensures data integrity and consistency.

📌 2. Goal of Normalization

📍 The main objectives of normalization are:
✅ Eliminate Redundant Data: Store each fact once instead of duplicating it in multiple places.
✅ Organize Data Logically: Store data in smaller, well-structured tables, each describing one entity or relationship.
✅ Use Storage Efficiently: Less duplication reduces disk usage and makes writes cheaper, although reads may then need more joins.
✅ Ensure Data Integrity: Prevent inconsistencies caused by anomalies.
✅ Make Data Easier to Change: A fact can be updated in one place without affecting other records.

📌 3. Types of Anomalies in Databases

When a table is not normalized, it leads to data anomalies, which affect the correctness of
the database.

🔹 A. Update Anomalies

📌 Occurs when the same fact is stored in several rows, so one logical change requires updating all of them.

✅ Example:

Employee (Emp_ID, Name, Dept, Dept_Location)
---------------------------------------------
101 | Alice | IT | New York
102 | Bob | HR | Los Angeles
103 | Alice | IT | New York <-- IT location stored again

❌ Problem: If the IT department moves to Chicago, we must update every IT row; missing one leaves the table inconsistent.

✅ Solution: Split into Employee and Department tables.
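
The effect of this split can be sketched with Python's built-in sqlite3 module and an in-memory database. This is only an illustration: the table name Employee_Flat and the variable names are assumptions introduced here, not part of the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Unnormalized design: the department location is repeated on every employee row.
cur.execute(
    "CREATE TABLE Employee_Flat ("
    "Emp_ID INTEGER PRIMARY KEY, Name TEXT, Dept TEXT, Dept_Location TEXT)"
)
cur.executemany(
    "INSERT INTO Employee_Flat VALUES (?, ?, ?, ?)",
    [(101, "Alice", "IT", "New York"),
     (102, "Bob", "HR", "Los Angeles"),
     (103, "Alice", "IT", "New York")],
)

# Normalized design: the location is stored once per department.
cur.execute("CREATE TABLE Department (Dept TEXT PRIMARY KEY, Dept_Location TEXT)")
cur.execute(
    "CREATE TABLE Employee ("
    "Emp_ID INTEGER PRIMARY KEY, Name TEXT, Dept TEXT REFERENCES Department(Dept))"
)
cur.executemany(
    "INSERT INTO Department VALUES (?, ?)",
    [("IT", "New York"), ("HR", "Los Angeles")],
)
cur.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(101, "Alice", "IT"), (102, "Bob", "HR"), (103, "Alice", "IT")],
)

# Moving IT to Chicago touches every IT row in the flat table...
flat_rows_changed = cur.execute(
    "UPDATE Employee_Flat SET Dept_Location = 'Chicago' WHERE Dept = 'IT'"
).rowcount
# ...but only one row in the normalized design.
normalized_rows_changed = cur.execute(
    "UPDATE Department SET Dept_Location = 'Chicago' WHERE Dept = 'IT'"
).rowcount

print(flat_rows_changed, normalized_rows_changed)
```

With the sample rows, the flat table needs two row updates while the Department table needs one, and that gap grows with the number of employees per department.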

🔹 B. Deletion Anomalies

📌 Occurs when deleting a record removes useful data unintentionally.

✅ Example:

Project (Proj_ID, Proj_Name, Dept, Employee)
---------------------------------------------
P1 | AI Dev | IT | Alice
P2 | HR App | HR | Bob

❌ Problem: If Project P2 is deleted, we lose HR department info.

✅ Solution: Separate into Project and Department tables.
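
A small sketch of the difference, again with Python's sqlite3 module. The name Project_Flat and the single-column Department table are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Flat design: the department exists only as a column of a project row.
cur.execute(
    "CREATE TABLE Project_Flat ("
    "Proj_ID TEXT PRIMARY KEY, Proj_Name TEXT, Dept TEXT, Employee TEXT)"
)
cur.executemany(
    "INSERT INTO Project_Flat VALUES (?, ?, ?, ?)",
    [("P1", "AI Dev", "IT", "Alice"), ("P2", "HR App", "HR", "Bob")],
)
cur.execute("DELETE FROM Project_Flat WHERE Proj_ID = 'P2'")
depts_flat = [r[0] for r in cur.execute(
    "SELECT DISTINCT Dept FROM Project_Flat ORDER BY Dept")]

# Split design: departments live in their own table and survive the delete.
cur.execute("CREATE TABLE Department (Dept TEXT PRIMARY KEY)")
cur.execute(
    "CREATE TABLE Project ("
    "Proj_ID TEXT PRIMARY KEY, Proj_Name TEXT, Dept TEXT REFERENCES Department(Dept))"
)
cur.executemany("INSERT INTO Department VALUES (?)", [("IT",), ("HR",)])
cur.executemany(
    "INSERT INTO Project VALUES (?, ?, ?)",
    [("P1", "AI Dev", "IT"), ("P2", "HR App", "HR")],
)
cur.execute("DELETE FROM Project WHERE Proj_ID = 'P2'")
depts_normalized = [r[0] for r in cur.execute(
    "SELECT Dept FROM Department ORDER BY Dept")]

print(depts_flat)        # HR is gone from the flat table
print(depts_normalized)  # HR is still recorded after the split
```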

🔹 C. Insert Anomalies

📌 Occurs when a fact cannot be inserted because unrelated data that the table also requires is not yet available.

✅ Example:

Student (Stud_ID, Name, Course, Instructor)
---------------------------------------------
101 | Alice | DBMS | Prof. John

❌ Problem: Cannot add a new instructor without at least one student enrolled, because Stud_ID (the key) would have to be left empty.

✅ Solution: Split into Student and Instructor tables.
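
The following sketch shows the failure and the fix with Python's sqlite3 module. The instructor 'Prof. Mary', the course 'AI' for her, and the table name Student_Flat are hypothetical values added for the demonstration. Stud_ID is declared INT NOT NULL so that SQLite does not auto-generate a key for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Flat design: an instructor can only be recorded as part of a student row.
cur.execute(
    "CREATE TABLE Student_Flat ("
    "Stud_ID INT NOT NULL PRIMARY KEY, Name TEXT, Course TEXT, Instructor TEXT)"
)
cur.execute("INSERT INTO Student_Flat VALUES (101, 'Alice', 'DBMS', 'Prof. John')")

try:
    # No student exists yet for this instructor, so there is no key value to supply.
    cur.execute(
        "INSERT INTO Student_Flat (Stud_ID, Name, Course, Instructor) "
        "VALUES (NULL, NULL, 'AI', 'Prof. Mary')"
    )
    flat_insert_ok = True
except sqlite3.IntegrityError:
    flat_insert_ok = False

# Split design: instructors have their own table and need no student.
cur.execute("CREATE TABLE Instructor (Instructor TEXT PRIMARY KEY, Course TEXT)")
cur.execute("INSERT INTO Instructor VALUES ('Prof. Mary', 'AI')")
instructor_count = cur.execute("SELECT COUNT(*) FROM Instructor").fetchone()[0]

print(flat_insert_ok, instructor_count)
```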

📌 4. Different Normalization Forms

Normalization is done in progressive steps: each normal form from 1NF to 5NF adds a stricter condition on top of the previous one. In practice, most schemas are normalized to 3NF or BCNF.

🔹 1NF: First Normal Form (Atomic Values, No Repeating Groups)

📌 Rules for 1NF:

1. Each column should have atomic values (no multiple values in one cell).

2. Each row must be unique (should have a primary key).

✅ Example (Before 1NF - Unnormalized)

Student (Stud_ID, Name, Courses)
---------------------------------
101 | Alice | DBMS, AI
102 | Bob | AI

❌ Problem: Multiple values in "Courses" column.

✅ After 1NF

Student (Stud_ID, Name, Course)
--------------------------------
101 | Alice | DBMS
101 | Alice | AI
102 | Bob | AI

The primary key is now the pair (Stud_ID, Course), because Stud_ID alone no longer identifies a row.

🎯 Now each column has atomic values!
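
The conversion itself can be sketched in a few lines of Python. The function name to_1nf and the assumption that courses are separated by commas are illustrative choices based on the sample data.

```python
# Unnormalized rows: the third field holds a comma-separated list of courses.
unnormalized = [
    (101, "Alice", "DBMS, AI"),
    (102, "Bob", "AI"),
]


def to_1nf(rows):
    """Emit one row per (student, course) pair so every value is atomic."""
    return [
        (stud_id, name, course.strip())
        for stud_id, name, courses in rows
        for course in courses.split(",")
    ]


first_normal_form = to_1nf(unnormalized)
for row in first_normal_form:
    print(row)
```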

🔹 2NF: Second Normal Form (Remove Partial Dependency)

📌 Rules for 2NF:

1. Meet 1NF requirements.

2. Remove partial dependencies (no non-key attribute may depend on only part of a composite primary key).

✅ Example (Before 2NF)

Enrollment (Stud_ID, Course_ID, Course_Name)
--------------------------------------------
101 | C1 | DBMS
101 | C2 | AI

❌ Problem: Course_Name depends only on Course_ID, which is just one part of the key (Stud_ID, Course_ID).

✅ After 2NF

Student (Stud_ID, Name)
------------------------
101 | Alice
102 | Bob

Course (Course_ID, Course_Name)
--------------------------------
C1 | DBMS
C2 | AI

Enrollment (Stud_ID, Course_ID)
--------------------------------
101 | C1
101 | C2
102 | C2

🎯 Now, no partial dependencies!
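
One way to spot a partial dependency in sample data is to test whether a functional dependency holds. The helper fd_holds below is a hypothetical name, and the third row (102, C2, AI) is taken from the decomposed Enrollment table. Note that a dependency holding in sample rows only suggests a rule; the real rule comes from the business requirements.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


enrollment = [
    {"Stud_ID": 101, "Course_ID": "C1", "Course_Name": "DBMS"},
    {"Stud_ID": 101, "Course_ID": "C2", "Course_Name": "AI"},
    {"Stud_ID": 102, "Course_ID": "C2", "Course_Name": "AI"},
]

# Course_Name is determined by Course_ID alone: a partial dependency on the key.
partial = fd_holds(enrollment, ["Course_ID"], ["Course_Name"])
# Stud_ID alone does not determine Course_Name (student 101 has two courses).
by_student = fd_holds(enrollment, ["Stud_ID"], ["Course_Name"])

print(partial, by_student)
```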

🔹 3NF: Third Normal Form (Remove Transitive Dependency)

📌 Rules for 3NF:

1. Meet 2NF requirements.

2. Remove transitive dependencies (Non-key attributes should not depend on other non-key attributes).

✅ Example (Before 3NF)

Employee (Emp_ID, Name, Dept, Dept_Location)
--------------------------------------------
101 | Alice | IT | New York
102 | Bob | HR | Los Angeles

❌ Problem: Dept_Location depends on Dept, not directly on Emp_ID (Emp_ID → Dept → Dept_Location).

✅ After 3NF

Employee (Emp_ID, Name, Dept)
-----------------------------
101 | Alice | IT
102 | Bob | HR

Department (Dept, Dept_Location)
--------------------------------
IT | New York
HR | Los Angeles

🎯 Now, no transitive dependencies!
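
A decomposition is only useful if the original rows can be recovered. The sketch below, using Python's sqlite3 module, joins the two 3NF tables back together; the variable name rebuilt is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# The two tables produced by the 3NF decomposition.
cur.executescript("""
CREATE TABLE Department (Dept TEXT PRIMARY KEY, Dept_Location TEXT);
CREATE TABLE Employee (
    Emp_ID INTEGER PRIMARY KEY,
    Name TEXT,
    Dept TEXT REFERENCES Department(Dept)
);
INSERT INTO Department VALUES ('IT', 'New York'), ('HR', 'Los Angeles');
INSERT INTO Employee VALUES (101, 'Alice', 'IT'), (102, 'Bob', 'HR');
""")

# Joining on Dept rebuilds the original four-column table without loss.
rebuilt = cur.execute(
    "SELECT e.Emp_ID, e.Name, e.Dept, d.Dept_Location "
    "FROM Employee e JOIN Department d ON d.Dept = e.Dept "
    "ORDER BY e.Emp_ID"
).fetchall()

for row in rebuilt:
    print(row)
```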

🔹 BCNF (Boyce-Codd Normal Form)

📌 Rules for BCNF:

1. Meet 3NF requirements.

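2. For every non-trivial functional dependency X → Y, the determinant X must be a superkey (it must contain a candidate key).

✅ Example (Before BCNF)

Professor (Prof_ID, Course, Dept)
---------------------------------
P1 | DBMS | CS
P2 | AI | CS

❌ Problem: Course → Dept, but Course is not a candidate key of Professor. (If the key is taken to be (Prof_ID, Course), this simple table also fails 2NF; tables that pass 3NF yet fail BCNF typically have overlapping candidate keys.)

✅ After BCNF

Professor (Prof_ID, Course)
---------------------------
P1 | DBMS
P2 | AI

Course (Course, Dept)
---------------------
DBMS | CS
AI | CS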

🎯 Now, Course is properly stored in a separate table.
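
The BCNF rule can be checked mechanically with an attribute-closure computation. The sketch below assumes that Course → Dept is the only functional dependency on Professor; the function names closure and bcnf_violations are illustrative.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(schema, fds):
    """List the non-trivial FDs whose left side is not a superkey of schema."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and not set(schema) <= closure(lhs, fds)
    ]


course_fd = [(("Course",), ("Dept",))]  # assumed to be the only FD

before = bcnf_violations(("Prof_ID", "Course", "Dept"), course_fd)
after_professor = bcnf_violations(("Prof_ID", "Course"), [])
after_course = bcnf_violations(("Course", "Dept"), course_fd)

print(before)           # Course -> Dept is reported as a violation
print(after_professor)  # no violations
print(after_course)     # no violations: Course is now the key
```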

🔹 4NF (Fourth Normal Form - Remove Multivalued Dependencies)

📌 Rules for 4NF:

1. Meet BCNF requirements.

2. Remove multivalued dependencies.

✅ Example (Before 4NF)

Author (Author_ID, Book, Magazine)
----------------------------------
A1 | DBMS | AI Journal
A1 | DBMS | CS Review

❌ Problem: Book and Magazine are independent.

✅ After 4NF

Author_Book (Author_ID, Book)
-----------------------------
A1 | DBMS

Author_Magazine (Author_ID, Magazine)
-------------------------------------
A1 | AI Journal
A1 | CS Review

🎯 Now, each independent data is stored separately!

🔹 5NF (Fifth Normal Form - Remove Join Dependencies)

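📌 Rules for 5NF:

• A table is in 5NF when every non-trivial join dependency is implied by its candidate keys.

• Informally, the table cannot be decomposed any further without losing information, except by splits based on its keys.

✅ Example: Used in complex multi-relational models like supply chain databases.

🔹 DKNF (Domain-Key Normal Form)

📌 Rules for DKNF:

• Every constraint on the table is a logical consequence of its domain constraints and key constraints.

• A table in DKNF has no insertion or deletion anomalies, so no further normalization is needed.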


📌 5. Higher Normal Forms (4NF, 5NF, DKNF)

After achieving BCNF, higher normalization forms 4NF, 5NF, and DKNF are used to further
eliminate redundancy and anomalies.

🔹 4NF: Fourth Normal Form (Eliminate Multivalued Dependencies)

📌 Definition:
A table is in 4NF if:

1. It is in BCNF.

2. For every non-trivial multivalued dependency (MVD) A →→ B, A is a superkey.

📍 Multivalued Dependency (MVD):

• A →→ B holds when each value of A is associated with a set of B values that is independent of the table's other attributes. If A is not a superkey, B should be stored in a separate table.

✅ Example (Before 4NF - Multivalued Dependency)

Author_ID | Book | Magazine
---------------------------
A1 | DBMS | AI Journal
A1 | DBMS | CS Review
A1 | AI | AI Journal
A1 | AI | CS Review

📌 Problem:

• Author_ID →→ Book (One author writes multiple books).

• Author_ID →→ Magazine (One author subscribes to multiple magazines).

• Books and Magazines are independent, so every book has to be paired with every magazine.

✅ After 4NF

• Split into two separate tables:

Author_Books Table

Author_ID | Book
----------------
A1 | DBMS
A1 | AI

Author_Magazines Table

Author_ID | Magazine
--------------------
A1 | AI Journal
A1 | CS Review

🎯 Now, no unnecessary duplication!
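
Whether such a split is safe can be tested by projecting the table and joining the projections back. The sketch below uses plain Python sets; the variable names are assumptions made for the illustration.

```python
# The original (Author_ID, Book, Magazine) rows from the example.
rows = {
    ("A1", "DBMS", "AI Journal"),
    ("A1", "DBMS", "CS Review"),
    ("A1", "AI", "AI Journal"),
    ("A1", "AI", "CS Review"),
}

# Project onto (Author_ID, Book) and (Author_ID, Magazine).
author_books = {(author, book) for author, book, _ in rows}
author_magazines = {(author, magazine) for author, _, magazine in rows}

# Natural join of the two projections on Author_ID.
rejoined = {
    (author, book, magazine)
    for author, book in author_books
    for other_author, magazine in author_magazines
    if author == other_author
}

lossless = rejoined == rows
print(len(rows), len(author_books), len(author_magazines), lossless)
```

Because the join returns exactly the original rows, the multivalued dependencies hold in this sample and the two smaller tables carry the same information in fewer rows per table.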

🔹 5NF: Fifth Normal Form (Eliminate Join Dependency)

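📌 Definition:
A table is in 5NF if:

1. It is in 4NF.

2. Every non-trivial join dependency is implied by its candidate keys.

📍 Join Dependency:

• A relation has a join dependency when it is always equal to the join of its projections onto certain subsets of its attributes. The case that matters for 5NF is a relation that cannot be split losslessly into two tables but can be rebuilt by joining three or more.

✅ Example (Before 5NF - Join Dependency)

Emp_ID | Project | Skill
------------------------
101 | AI Dev | Python
101 | HR App | Java
102 | AI Dev | Python
102 | HR App | Java

📌 Problem:

• The table combines three pairwise facts: which skills an employee has, which skill a project uses, and which projects an employee works on.

• If a row exists exactly when all three pairings exist, each pairing is stored redundantly.

• Some two-table splits lose information: rejoining only Employee_Skill and Employee_Project produces rows that were never in the table, such as (101, AI Dev, Java).

✅ After 5NF

• Split into three separate tables:

Employee_Skill Table

Emp_ID | Skill
--------------
101 | Python
101 | Java
102 | Python
102 | Java

Project_Skill Table

Project | Skill
---------------
AI Dev | Python
HR App | Java

Employee_Project Table

Emp_ID | Project
----------------
101 | AI Dev
101 | HR App
102 | AI Dev
102 | HR App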

🎯 Now, no redundant data, and the original table can be reconstructed!
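
The reconstruction claim can be checked on the sample rows with plain Python sets. This is a sketch over the example data only; the variable names are assumptions, and a join dependency that holds in sample rows is not proof that it holds as a rule.

```python
# The original (Emp_ID, Project, Skill) rows from the example.
rows = {
    (101, "AI Dev", "Python"),
    (101, "HR App", "Java"),
    (102, "AI Dev", "Python"),
    (102, "HR App", "Java"),
}

# The three pairwise projections.
emp_skill = {(emp, skill) for emp, _, skill in rows}
proj_skill = {(proj, skill) for _, proj, skill in rows}
emp_proj = {(emp, proj) for emp, proj, _ in rows}

# Joining only two projections (on Emp_ID) invents combinations.
two_way = {
    (emp, proj, skill)
    for emp, proj in emp_proj
    for other_emp, skill in emp_skill
    if emp == other_emp
}
spurious = two_way - rows

# Adding the third projection as a filter removes the spurious rows.
three_way = {row for row in two_way if (row[1], row[2]) in proj_skill}

print(len(two_way), len(spurious), three_way == rows)
```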

🔹 DKNF: Domain-Key Normal Form

📌 Definition:
A table is in DKNF if every constraint on it is a logical consequence of:

1. Domain constraints (the allowed values of each attribute).

2. Key constraints (the attributes that uniquely identify a row).

📍 Practical Use:

• DKNF is largely theoretical and rarely implemented.

• A table in DKNF has no constraints other than those implied by keys and domains.

✅ Example:

• A rule like "Salary must be > 0" is a domain constraint. A table whose only rules are of this kind, together with its keys, is DKNF-compliant.
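
As a rough illustration, a domain rule and a key rule can both be declared in the schema so the database rejects bad rows itself. The sketch uses Python's sqlite3 module; the Employee columns and the salary figures are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The key constraint (PRIMARY KEY) and the domain constraint (CHECK)
# are both part of the table definition.
conn.execute(
    "CREATE TABLE Employee ("
    "Emp_ID INTEGER PRIMARY KEY, "
    "Name TEXT NOT NULL, "
    "Salary REAL NOT NULL CHECK (Salary > 0))"
)
conn.execute("INSERT INTO Employee VALUES (101, 'Alice', 5000)")


def try_insert(values):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Employee VALUES (?, ?, ?)", values)
        return True
    except sqlite3.IntegrityError:
        return False


negative_salary_ok = try_insert((102, "Bob", -10))   # violates the domain rule
duplicate_key_ok = try_insert((101, "Carol", 4000))  # violates the key rule
valid_row_ok = try_insert((103, "Dave", 4200))

print(negative_salary_ok, duplicate_key_ok, valid_row_ok)
```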

📌 6. Summary Table for All Normal Forms


Normalization | Rule | Example Fix
----------------------------------
1NF | Remove duplicate columns, ensure atomicity | Separate repeating groups into new rows
2NF | Remove partial dependencies | Move attributes dependent on part of PK to another table
3NF | Remove transitive dependencies | Move non-key dependent attributes to a new table
BCNF | Every determinant must be a candidate key | Ensure no anomalies from overlapping FDs
4NF | Remove multivalued dependencies | Split independent multi-valued attributes into separate tables
5NF | Remove join dependencies | Decompose table further without losing data
DKNF | Every constraint enforced through domain or key | Store all constraints at schema level

📌 7. Practical Implementation of Normalization

✅ Example: SQL Queries for Normalization

Step 1: Create an Unnormalized Table

CREATE TABLE Student (
    Stud_ID INT PRIMARY KEY,
    Name VARCHAR(50),
    Courses VARCHAR(100) -- Not atomic (violates 1NF)
);

❌ Problem: Courses has multiple values (DBMS, AI).

Step 2: Convert to 1NF

-- One row per (student, course) pair; the Courses column
-- is then dropped from Student.
CREATE TABLE Student_Course (
    Stud_ID INT,
    Course VARCHAR(50),
    PRIMARY KEY (Stud_ID, Course),
    FOREIGN KEY (Stud_ID) REFERENCES Student(Stud_ID)
);

✅ Each column now has atomic values!

Step 3: Convert to 2NF

-- Course details are keyed by Course_ID alone; enrollment rows
-- then reference Course_ID instead of repeating the course name.
CREATE TABLE Course (
    Course_ID INT PRIMARY KEY,
    Course_Name VARCHAR(50)
);

✅ Course information moved to a separate table to remove partial dependency!

Step 4: Convert to 3NF

-- Location (Dept_Location in the 3NF example) depends on the
-- department, not on an employee, so it is stored once per department.
CREATE TABLE Department (
    Dept_ID INT PRIMARY KEY,
    Dept_Name VARCHAR(50),
    Location VARCHAR(50)
);

✅ Now, Dept_Location is stored in a separate Department table!
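
The student and course steps above can be run end to end with Python's sqlite3 module, as a sketch. The Enrollment table, the sample rows, and the course ID 99 are assumptions added to tie the tables together; SQLite only enforces foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

conn.executescript("""
CREATE TABLE Student (Stud_ID INT PRIMARY KEY, Name VARCHAR(50));
CREATE TABLE Course (Course_ID INT PRIMARY KEY, Course_Name VARCHAR(50));
-- Hypothetical linking table: one row per (student, course) pair.
CREATE TABLE Enrollment (
    Stud_ID INT,
    Course_ID INT,
    PRIMARY KEY (Stud_ID, Course_ID),
    FOREIGN KEY (Stud_ID) REFERENCES Student(Stud_ID),
    FOREIGN KEY (Course_ID) REFERENCES Course(Course_ID)
);
INSERT INTO Student VALUES (101, 'Alice'), (102, 'Bob');
INSERT INTO Course VALUES (1, 'DBMS'), (2, 'AI');
INSERT INTO Enrollment VALUES (101, 1), (101, 2), (102, 2);
""")

# A join reproduces the readable "student takes course" listing.
report = conn.execute(
    "SELECT s.Name, c.Course_Name "
    "FROM Enrollment e "
    "JOIN Student s ON s.Stud_ID = e.Stud_ID "
    "JOIN Course c ON c.Course_ID = e.Course_ID "
    "ORDER BY s.Stud_ID, c.Course_ID"
).fetchall()

# The foreign key rejects an enrollment in a course that does not exist.
try:
    conn.execute("INSERT INTO Enrollment VALUES (101, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(report, orphan_rejected)
```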


📌 8. Conclusion

• Normalization reduces redundancy and removes update, deletion, and insert anomalies.

• Each step eliminates different types of redundancy and anomalies.

• Higher normal forms (BCNF, 4NF, 5NF, DKNF) remove the rarer anomalies that 3NF can leave behind; most practical schemas stop at 3NF or BCNF.
