Normalization in SQL
Levels of Normalization
There are several levels of normalization, each building upon the previous one.
First Normal Form (1NF)
Example:
Explanation:
The Courses column in the unnormalized table contains multiple values
(comma-separated).
To bring it into 1NF, we split the data into multiple rows, ensuring that each
column contains atomic values.
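The split described above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names Students_Unnormalized and Students_1NF, the Name column, and the sample rows are assumptions, not the data from the original example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical unnormalized table: Courses holds a comma-separated list.
conn.execute(
    "CREATE TABLE Students_Unnormalized (Student_ID INTEGER, Name TEXT, Courses TEXT)"
)
conn.executemany(
    "INSERT INTO Students_Unnormalized VALUES (?, ?, ?)",
    [(1, "Alice", "Math, Science"), (2, "Bob", "History")],
)

# 1NF version: one atomic course value per row.
conn.execute(
    "CREATE TABLE Students_1NF (Student_ID INTEGER, Name TEXT, Course TEXT, "
    "PRIMARY KEY (Student_ID, Course))"
)
source_rows = conn.execute("SELECT * FROM Students_Unnormalized").fetchall()
for student_id, name, courses in source_rows:
    for course in courses.split(","):
        conn.execute(
            "INSERT INTO Students_1NF VALUES (?, ?, ?)",
            (student_id, name, course.strip()),
        )

rows_1nf = conn.execute(
    "SELECT * FROM Students_1NF ORDER BY Student_ID, Course"
).fetchall()
print(rows_1nf)
# [(1, 'Alice', 'Math'), (1, 'Alice', 'Science'), (2, 'Bob', 'History')]
```

Note that Name is now repeated for each course of the same student; removing that kind of repetition is the job of the higher normal forms.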
Second Normal Form (2NF)
A table is in 2NF if:
It is in 1NF.
There is no partial dependency, meaning non-key attributes are fully dependent
on the primary key.
Partial dependency occurs when a non-key column is dependent on only part of the
primary key (in the case of a composite primary key).
Example:
Student_ID Course_ID Instructor
1 101 A
1 102 B
2 101 A
1. Courses Table (Primary Key: Course_ID):
Course_ID Instructor
101 A
102 B
Explanation:
Instructor depends only on Course_ID, which is just one part of the composite
primary key (Student_ID, Course_ID). This is a partial dependency.
So, we move the instructor information into a separate Courses table, where the
non-key attribute Instructor is fully dependent on the entire primary key
(Course_ID).
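A minimal sketch of this decomposition, using Python's built-in sqlite3 module and the example rows (1, 101, A), (1, 102, B), (2, 101, A). The table names Enrollments_Original and Student_Courses are assumed here for illustration; only Courses is named in the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Enrollments_Original (
    Student_ID INTEGER, Course_ID INTEGER, Instructor TEXT,
    PRIMARY KEY (Student_ID, Course_ID)
);
INSERT INTO Enrollments_Original VALUES (1, 101, 'A'), (1, 102, 'B'), (2, 101, 'A');

-- Instructor depends only on Course_ID, so it moves to its own table.
CREATE TABLE Courses (Course_ID INTEGER PRIMARY KEY, Instructor TEXT);
INSERT INTO Courses SELECT DISTINCT Course_ID, Instructor FROM Enrollments_Original;

-- The enrollment table keeps only the composite key.
CREATE TABLE Student_Courses (
    Student_ID INTEGER,
    Course_ID INTEGER REFERENCES Courses(Course_ID),
    PRIMARY KEY (Student_ID, Course_ID)
);
INSERT INTO Student_Courses SELECT Student_ID, Course_ID FROM Enrollments_Original;
""")

courses = conn.execute("SELECT * FROM Courses ORDER BY Course_ID").fetchall()

# Joining the two tables gives back the original rows, so nothing was lost.
rejoined = conn.execute("""
    SELECT sc.Student_ID, sc.Course_ID, c.Instructor
    FROM Student_Courses sc JOIN Courses c ON c.Course_ID = sc.Course_ID
    ORDER BY sc.Student_ID, sc.Course_ID
""").fetchall()
print(courses)   # [(101, 'A'), (102, 'B')]
print(rejoined)  # [(1, 101, 'A'), (1, 102, 'B'), (2, 101, 'A')]
```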
Third Normal Form (3NF)
A table is in 3NF if:
It is in 2NF.
There is no transitive dependency, meaning non-key attributes are not dependent
on other non-key attributes.
Example:
1 John Doe 1
2 Jane Smith 2
1 HR Building A
2 IT Building B
Explanation:
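The example rows suggest two tables linked by a department number. As a hedged sketch of such a 3NF design, the snippet below uses Python's built-in sqlite3 module. The table names and the column names Emp_ID, Name, Dept_ID, Dept_Name, and Location are assumptions; only the row values come from the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Table and column names are assumed; the rows follow the example values.
CREATE TABLE Departments (
    Dept_ID INTEGER PRIMARY KEY, Dept_Name TEXT, Location TEXT
);
CREATE TABLE Employees (
    Emp_ID INTEGER PRIMARY KEY, Name TEXT,
    Dept_ID INTEGER REFERENCES Departments(Dept_ID)
);
INSERT INTO Departments VALUES (1, 'HR', 'Building A'), (2, 'IT', 'Building B');
INSERT INTO Employees VALUES (1, 'John Doe', 1), (2, 'Jane Smith', 2);
""")

# Department facts are looked up through Dept_ID instead of being
# repeated on every employee row.
employee_view = conn.execute("""
    SELECT e.Emp_ID, e.Name, d.Dept_Name, d.Location
    FROM Employees e JOIN Departments d ON d.Dept_ID = e.Dept_ID
    ORDER BY e.Emp_ID
""").fetchall()
print(employee_view)
# [(1, 'John Doe', 'HR', 'Building A'), (2, 'Jane Smith', 'IT', 'Building B')]
```

With this layout, a department's name and location depend only on Dept_ID, the key of their own table, rather than on an employee's key by way of a non-key column.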
Boyce-Codd Normal Form (BCNF)
A table is in BCNF if:
It is in 3NF.
For every non-trivial functional dependency, the left-hand side (the determinant)
is a superkey.
Example:
Student_ID Course_ID
1 101
1 102
2 101
1. Courses Table (Primary Key: Course_ID):
Course_ID Instructor
Explanation:
The original table violated BCNF because Instructor depended only on Course_ID,
and Course_ID on its own is not a superkey of that table.
We moved Instructor into a Courses table keyed by Course_ID, so that every
determinant is now a superkey of the table it appears in.
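One way to explore this on real data is to test whether a candidate functional dependency is contradicted by the rows. The helper fd_holds below is a hypothetical function written for this sketch (Python, built-in sqlite3), and the Enrollments table name is assumed. Sample data can only refute a dependency, never prove it as a rule, so treat the result as a hint.

```python
import sqlite3


def fd_holds(conn, table, determinant, dependent):
    """Return True if no determinant value maps to more than one dependent value."""
    # Identifiers are interpolated directly, so pass only trusted names.
    query = (
        f"SELECT COUNT(*) FROM (SELECT {determinant} FROM {table} "
        f"GROUP BY {determinant} HAVING COUNT(DISTINCT {dependent}) > 1)"
    )
    return conn.execute(query).fetchone()[0] == 0


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Enrollments (Student_ID INTEGER, Course_ID INTEGER, Instructor TEXT)"
)
conn.executemany(
    "INSERT INTO Enrollments VALUES (?, ?, ?)",
    [(1, 101, "A"), (1, 102, "B"), (2, 101, "A")],
)

# Course_ID -> Instructor is consistent with the data, yet Course_ID does not
# determine Student_ID, so Course_ID is a determinant that is not a superkey.
course_determines_instructor = fd_holds(conn, "Enrollments", "Course_ID", "Instructor")
course_determines_student = fd_holds(conn, "Enrollments", "Course_ID", "Student_ID")
print(course_determines_instructor, course_determines_student)  # True False
```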
Fourth Normal Form (4NF)
A table is in 4NF if:
It is in BCNF.
It has no non-trivial multi-valued dependency whose determinant is not a superkey,
meaning one table does not store two or more independent multi-valued facts about
the same entity.
Example:
Student_ID Course Hobby
1 Math Painting
1 English Reading
1. Student_Courses Table:
Student_ID Course
1 Math
1 English
2. Student_Hobbies Table:
Student_ID Hobby
1 Painting
1 Reading
Explanation:
In the original table, a student's courses and hobbies are independent of each
other, yet every row pairs a course with a hobby, leading to redundancy.
We separate hobbies and courses into different tables to remove the multi-valued
dependencies.
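The effect of the split can be sketched with Python's built-in sqlite3 module, using the Student_Courses and Student_Hobbies rows from the example. The extra hobby 'Chess' is a made-up value added only to show the behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student_Courses (
    Student_ID INTEGER, Course TEXT, PRIMARY KEY (Student_ID, Course)
);
CREATE TABLE Student_Hobbies (
    Student_ID INTEGER, Hobby TEXT, PRIMARY KEY (Student_ID, Hobby)
);
INSERT INTO Student_Courses VALUES (1, 'Math'), (1, 'English');
INSERT INTO Student_Hobbies VALUES (1, 'Painting'), (1, 'Reading');
""")

# Adding a hobby is a single insert; no course rows need to be repeated.
conn.execute("INSERT INTO Student_Hobbies VALUES (1, 'Chess')")
hobby_rows = conn.execute("SELECT COUNT(*) FROM Student_Hobbies").fetchone()[0]

# Joining the two tables yields every course/hobby pairing, which is what a
# single combined table would have to store explicitly.
combined = conn.execute("""
    SELECT c.Student_ID, c.Course, h.Hobby
    FROM Student_Courses c JOIN Student_Hobbies h ON h.Student_ID = c.Student_ID
""").fetchall()
print(hobby_rows, len(combined))  # 3 6
```

Three hobby rows and two course rows stand in for the six combined rows a single table would need for the same facts.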
Fifth Normal Form (5NF)
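A table is in 5NF if:
It is in 4NF.
Every non-trivial join dependency is implied by its candidate keys, i.e., the table
cannot be split into smaller tables that join back to it without loss, except by
decompositions based on its candidate keys.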
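A join dependency is easiest to see on data. The sketch below (Python, built-in sqlite3) uses a hypothetical Supply table of suppliers, parts, and projects; all names and rows are assumptions chosen for illustration. It shows a table that is rebuilt exactly by joining its three pairwise projections, while joining only two of them adds a spurious row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table: which supplier provides which part to which project.
CREATE TABLE Supply (Supplier TEXT, Part TEXT, Project TEXT);
INSERT INTO Supply VALUES
    ('S1', 'P1', 'J2'), ('S1', 'P2', 'J1'), ('S2', 'P1', 'J1'), ('S1', 'P1', 'J1');

-- Three pairwise projections of the table.
CREATE TABLE Supplier_Part AS SELECT DISTINCT Supplier, Part FROM Supply;
CREATE TABLE Part_Project AS SELECT DISTINCT Part, Project FROM Supply;
CREATE TABLE Project_Supplier AS SELECT DISTINCT Project, Supplier FROM Supply;
""")

original = set(conn.execute("SELECT Supplier, Part, Project FROM Supply"))

# Joining only two projections produces a spurious row ...
two_way = set(conn.execute("""
    SELECT sp.Supplier, sp.Part, pj.Project
    FROM Supplier_Part sp JOIN Part_Project pj ON pj.Part = sp.Part
"""))

# ... while joining all three reconstructs the table exactly.
three_way = set(conn.execute("""
    SELECT sp.Supplier, sp.Part, pj.Project
    FROM Supplier_Part sp
    JOIN Part_Project pj ON pj.Part = sp.Part
    JOIN Project_Supplier js
      ON js.Project = pj.Project AND js.Supplier = sp.Supplier
"""))
print(two_way - original)     # {('S2', 'P1', 'J2')}
print(three_way == original)  # True
```

If this three-way pattern is a standing rule of the data rather than a coincidence of these rows, the Supply table is not in 5NF and the three smaller tables are the 5NF design.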
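Normal Form  Criteria
1NF  Each column contains atomic values.
2NF  In 1NF, with no partial dependency.
3NF  In 2NF, with no transitive dependency.
BCNF  In 3NF, and every determinant is a superkey.
4NF  In BCNF, with no non-trivial multi-valued dependencies.
5NF  In 4NF, with no join dependencies beyond those implied by candidate keys.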
Advantages of Normalization:
1. Reduces Data Redundancy: Prevents the same data from being stored in multiple
places.
2. Improves Data Integrity: Ensures that data is consistent across the database.
3. Efficient Updates: Changes to data are easier since there is no duplication of data.
4. Faster Queries: Smaller, more specific tables can speed up query performance.
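The point about updates can be illustrated with a small sketch in Python's built-in sqlite3 module. The table name Enrollments_Flat and the new instructor value 'C' are assumptions made for this example; the rows reuse the Student_ID, Course_ID, and Instructor values from the 2NF example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Denormalized: the instructor is repeated on every enrollment row.
CREATE TABLE Enrollments_Flat (Student_ID INTEGER, Course_ID INTEGER, Instructor TEXT);
INSERT INTO Enrollments_Flat VALUES (1, 101, 'A'), (1, 102, 'B'), (2, 101, 'A');

-- Normalized: the instructor is stored once per course.
CREATE TABLE Courses (Course_ID INTEGER PRIMARY KEY, Instructor TEXT);
INSERT INTO Courses VALUES (101, 'A'), (102, 'B');
""")

# Changing the instructor of course 101 touches every matching row in the
# flat table, but exactly one row in the normalized table.
flat_changed = conn.execute(
    "UPDATE Enrollments_Flat SET Instructor = 'C' WHERE Course_ID = 101"
).rowcount
normalized_changed = conn.execute(
    "UPDATE Courses SET Instructor = 'C' WHERE Course_ID = 101"
).rowcount
print(flat_changed, normalized_changed)  # 2 1
```

In the flat table, an update that misses one of the duplicated rows would leave course 101 with two different instructors, which is the inconsistency normalization is meant to prevent.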
Drawbacks of Normalization:
Complex Queries: More joins might be required, making queries more complex.
Performance Impact: Excessive normalization might lead to performance
degradation in certain cases, especially when read-heavy queries must join many
tables over large datasets.
Conclusion: