Database Normalization

Notes for information management IT

Uploaded by

crystaljhoyl

DATABASE NORMALIZATION

Database normalization is a systematic approach to organizing a database
to reduce redundancy and improve data integrity. The process involves
decomposing tables into smaller, related tables without losing data.
Normalization is typically carried out in several stages, known as Normal
Forms (NFs). Each normal form has specific requirements that a table must
meet before moving to the next form.

Below, we'll explore the first three normal forms—First Normal Form (1NF),
Second Normal Form (2NF), and Third Normal Form (3NF)—with
examples to illustrate each concept. We'll also touch briefly on higher
normal forms like Boyce-Codd Normal Form (BCNF).

1. First Normal Form (1NF)


Definition:
A table is in 1NF if:

• It contains only atomic (indivisible) values.
• Each entry in a column is of the same data type.
• Each column has a unique name.
• The order in which data is stored does not matter.

Issues Addressed:
Eliminates repeating groups and ensures that each field contains only one
value.

Example:

Unnormalized Table:

StudentID  StudentName  Courses
1          Alice        Math, Science
2          Bob          History, Math
3          Charlie      Science

Problems:

• The Courses column contains multiple values (violates atomicity).


Normalized to 1NF:

StudentID  StudentName  Course
1          Alice        Math
1          Alice        Science
2          Bob          History
2          Bob          Math
3          Charlie      Science

Benefits:

• Ensures that each field contains only one value.
• Simplifies querying and data manipulation.
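
The conversion above can be sketched in a few lines of Python. This is only an illustration on the sample rows from this section; the function name to_first_normal_form is our own, and it assumes the courses are always separated by commas.

```python
# Unnormalized rows from the example: (StudentID, StudentName, Courses)
unnormalized = [
    (1, "Alice", "Math, Science"),
    (2, "Bob", "History, Math"),
    (3, "Charlie", "Science"),
]


def to_first_normal_form(rows):
    """Split the comma-separated Courses value into one row per course."""
    result = []
    for student_id, name, courses in rows:
        for course in courses.split(","):
            # strip() removes the space that follows each comma
            result.append((student_id, name, course.strip()))
    return result


rows_1nf = to_first_normal_form(unnormalized)
for row in rows_1nf:
    print(row)
```

Each printed row now holds exactly one course, matching the 1NF table shown above.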

2. Second Normal Form (2NF)


Definition:
A table is in 2NF if:

• It is already in 1NF.
• All non-key attributes are fully functionally dependent on the entire
primary key.

Key Terms:

• Primary Key: A column, or combination of columns, that uniquely
identifies each row in a table.
• Functional Dependency: A relationship where one attribute (or set of
attributes) uniquely determines another.
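
To make the idea of a functional dependency concrete, here is a small Python sketch that tests whether a dependency is contradicted by a set of rows. The helper name holds is hypothetical, and the rows are the sample data used in this section's example. Keep in mind that a dependency is a rule about the data model: sample rows can disprove it, but they cannot prove it holds for all future data.

```python
def holds(rows, determinant, dependent):
    """Return True if no two rows contradict determinant -> dependent.

    rows is a list of dicts; determinant and dependent are lists of
    column names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in determinant)
        value = tuple(row[c] for c in dependent)
        # The first value seen for a key is remembered; any later row with
        # the same key but a different value breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


columns = ["StudentID", "Course", "Instructor", "InstructorPhone"]
rows = [dict(zip(columns, values)) for values in [
    (1, "Math", "Dr. Smith", "555-1234"),
    (1, "Science", "Dr. Jones", "555-5678"),
    (2, "History", "Dr. Brown", "555-8765"),
    (2, "Math", "Dr. Smith", "555-1234"),
    (3, "Science", "Dr. Jones", "555-5678"),
]]

print(holds(rows, ["Course"], ["Instructor", "InstructorPhone"]))  # True
print(holds(rows, ["StudentID"], ["Course"]))                      # False
```

The first check succeeds because every course always appears with the same instructor; the second fails because student 1 is enrolled in two different courses.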

Issues Addressed:
Removes partial dependencies where non-key attributes depend only on
part of a composite primary key.

Example:

1NF Table with Composite Primary Key:

StudentID  Course   Instructor  InstructorPhone
1          Math     Dr. Smith   555-1234
1          Science  Dr. Jones   555-5678
2          History  Dr. Brown   555-8765
2          Math     Dr. Smith   555-1234
3          Science  Dr. Jones   555-5678

Primary Key: (StudentID, Course)

Problems:

• Instructor and InstructorPhone depend only on Course, not on the entire
primary key.

Normalized to 2NF:

Students_Courses Table:

StudentID  Course
1          Math
1          Science
2          History
2          Math
3          Science

Courses Table:

Course   Instructor  InstructorPhone
Math     Dr. Smith   555-1234
Science  Dr. Jones   555-5678
History  Dr. Brown   555-8765

Benefits:

• Eliminates partial dependencies.
• Reduces data redundancy (e.g., instructor information is stored once
per course).
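
The 2NF split can be reproduced as two projections of the 1NF table. The sketch below uses a hypothetical helper named project; it also rejoins the two results on Course to confirm that no data was lost, which is the property the introduction calls "without losing data".

```python
def project(rows, columns):
    """Keep only the given columns and drop duplicate rows (order preserved)."""
    seen = set()
    result = []
    for row in rows:
        values = tuple(row[c] for c in columns)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(columns, values)))
    return result


columns = ["StudentID", "Course", "Instructor", "InstructorPhone"]
rows = [dict(zip(columns, values)) for values in [
    (1, "Math", "Dr. Smith", "555-1234"),
    (1, "Science", "Dr. Jones", "555-5678"),
    (2, "History", "Dr. Brown", "555-8765"),
    (2, "Math", "Dr. Smith", "555-1234"),
    (3, "Science", "Dr. Jones", "555-5678"),
]]

# Decompose: the key columns stay together, and the attributes that depend
# only on Course move to their own table.
students_courses = project(rows, ["StudentID", "Course"])
courses = project(rows, ["Course", "Instructor", "InstructorPhone"])

# Rejoin on Course and compare with the original table.
rejoined = [
    {**sc, **c}
    for sc in students_courses
    for c in courses
    if sc["Course"] == c["Course"]
]

print(len(courses))      # 3 rows: instructor details stored once per course
print(rejoined == rows)  # True: the decomposition loses no information
```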

3. Third Normal Form (3NF)


Definition:
A table is in 3NF if:

• It is already in 2NF.
• There are no transitive dependencies; non-key attributes do not
depend on other non-key attributes.

Issues Addressed:
Removes dependencies where non-key attributes depend on other non-key
attributes, not directly on the primary key.

Example:

2NF Table with Transitive Dependency:

The Courses table from the 2NF step, extended with an InstructorEmail
column:

Course   Instructor  InstructorPhone  InstructorEmail
Math     Dr. Smith   555-1234         [email protected]
Science  Dr. Jones   555-5678         [email protected]
History  Dr. Brown   555-8765         [email protected]

Primary Key: Course

The Students_Courses table from the 2NF step has no non-key attributes, so
it is already in 3NF.

Problems:

• InstructorPhone and InstructorEmail depend on Instructor, which in turn
depends on Course (Course → Instructor → InstructorPhone,
InstructorEmail). They depend on the primary key only transitively.

Normalized to 3NF:

Students_Courses Table:

StudentID  Course
1          Math
1          Science
2          History
2          Math
3          Science

Courses Table:

Course   Instructor
Math     Dr. Smith
Science  Dr. Jones
History  Dr. Brown

Instructors Table:

Instructor  InstructorPhone  InstructorEmail
Dr. Smith   555-1234         [email protected]
Dr. Jones   555-5678         [email protected]
Dr. Brown   555-8765         [email protected]

Benefits:
• Eliminates transitive dependencies.
• Further reduces data redundancy and improves data integrity.
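
As a rough sketch, the three 3NF tables could be declared in SQLite through Python's built-in sqlite3 module. The column types, the foreign keys, and the new phone number 555-0000 are assumptions made for illustration; the InstructorEmail column is declared but left empty because the addresses are not legible in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE Instructors (
    Instructor      TEXT PRIMARY KEY,
    InstructorPhone TEXT NOT NULL,
    InstructorEmail TEXT
);
CREATE TABLE Courses (
    Course     TEXT PRIMARY KEY,
    Instructor TEXT NOT NULL REFERENCES Instructors(Instructor)
);
CREATE TABLE Students_Courses (
    StudentID INTEGER NOT NULL,
    Course    TEXT NOT NULL REFERENCES Courses(Course),
    PRIMARY KEY (StudentID, Course)
);
""")

with conn:  # commits on success
    conn.executemany(
        "INSERT INTO Instructors (Instructor, InstructorPhone) VALUES (?, ?)",
        [("Dr. Smith", "555-1234"), ("Dr. Jones", "555-5678"),
         ("Dr. Brown", "555-8765")],
    )
    conn.executemany(
        "INSERT INTO Courses VALUES (?, ?)",
        [("Math", "Dr. Smith"), ("Science", "Dr. Jones"),
         ("History", "Dr. Brown")],
    )
    conn.executemany(
        "INSERT INTO Students_Courses VALUES (?, ?)",
        [(1, "Math"), (1, "Science"), (2, "History"), (2, "Math"),
         (3, "Science")],
    )
    # A phone number changes (555-0000 is a made-up value): one row to update.
    conn.execute(
        "UPDATE Instructors SET InstructorPhone = ? WHERE Instructor = ?",
        ("555-0000", "Dr. Smith"),
    )

# Joining the tables rebuilds the wide view, with the new number everywhere.
report = conn.execute("""
    SELECT sc.StudentID, sc.Course, c.Instructor, i.InstructorPhone
    FROM Students_Courses AS sc
    JOIN Courses AS c ON c.Course = sc.Course
    JOIN Instructors AS i ON i.Instructor = c.Instructor
    ORDER BY sc.StudentID, sc.Course
""").fetchall()
for row in report:
    print(row)
```

Because the phone number lives in exactly one row, a single UPDATE is enough, and every enrollment for Dr. Smith shows the new value after the join.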

Boyce-Codd Normal Form (BCNF)


Definition:
A table is in BCNF if:

• It is already in 3NF.
• For every non-trivial functional dependency (X → Y), X is a super key.

Issues Addressed:
Handles certain anomalies that 3NF does not, especially involving
overlapping candidate keys.

Example:

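Table Not in BCNF:

Course   Instructor  Department
Math     Dr. Smith   Mathematics
Science  Dr. Jones   Science
History  Dr. Brown   Humanities
Art      Dr. Smith   Mathematics

Functional Dependencies:

• Course → Instructor, Department
• Instructor → Department

Problem:

• Instructor → Department violates BCNF because Instructor is not a super
key. Since Course is the only candidate key here, this is also a transitive
dependency (Course → Instructor → Department), so strictly the table fails
3NF as well. A table that is in 3NF but not in BCNF needs overlapping
candidate keys.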

Normalized to BCNF:

Courses Table:

Course   Instructor
Math     Dr. Smith
Science  Dr. Jones
History  Dr. Brown
Art      Dr. Smith

Instructors Table:

Instructor  Department
Dr. Smith   Mathematics
Dr. Jones   Science
Dr. Brown   Humanities

Benefits:

• Ensures that every determinant is a super key.
• Eliminates anomalies related to dependencies that aren't handled
by 3NF.
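
The super key test can be automated with the standard attribute-closure algorithm. The sketch below is a minimal version: the function names closure and bcnf_violations are our own, and it inspects only the dependencies that are explicitly listed, which is adequate for a small example like the one in this section.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the non-trivial dependencies whose left side is not a super key."""
    all_attrs = set(attributes)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs)          # skip trivial dependencies
        and closure(lhs, fds) != all_attrs   # left side is not a super key
    ]


# The table and dependencies from the example in this section.
attributes = ("Course", "Instructor", "Department")
fds = [
    (("Course",), ("Instructor", "Department")),
    (("Instructor",), ("Department",)),
]
print(bcnf_violations(attributes, fds))
# [(('Instructor',), ('Department',))]

# After the split, each table has one dependency and its left side is a key.
print(bcnf_violations(("Course", "Instructor"),
                      [(("Course",), ("Instructor",))]))          # []
print(bcnf_violations(("Instructor", "Department"),
                      [(("Instructor",), ("Department",))]))      # []
```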

Summary of Normal Forms


Normal Form  Requirements                            Purpose
1NF          Atomic values, unique column names,     Eliminate repeating groups
             single data type per column             and ensure atomicity
2NF          1NF + no partial dependencies on a      Remove partial dependencies
             composite primary key
3NF          2NF + no transitive dependencies        Remove transitive dependencies
BCNF         3NF + every determinant is a super key  Handle certain anomalies
                                                     beyond 3NF

Benefits of Normalization
1. Reduces Data Redundancy: Minimizes duplicate data, saving
storage and ensuring consistency.
2. Improves Data Integrity: Ensures that data dependencies make
sense, maintaining accuracy.
3. Can Improve Query Performance: Smaller tables mean less data to scan
and update, although queries that span several tables need joins.
4. Facilitates Maintenance: Easier to update, insert, and delete data
without anomalies.
When to Denormalize
While normalization has many benefits, there are scenarios where
denormalization (deliberately reintroducing redundancy, usually by
combining tables) is advantageous, such as:

• Performance Optimization: Reducing the number of joins can
speed up read operations.
• Simplifying Queries: Fewer tables can make queries easier to write
and understand.
• Specific Use Cases: Data warehouses and reporting systems often
use denormalized structures for efficiency.

Note: Denormalization should be approached carefully to balance
performance gains with potential data redundancy and integrity issues.
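
The trade-off in the note can be seen in a small sqlite3 sketch. The Enrollment_Report table name and the instructor reassignment are hypothetical; the point is only that a denormalized copy answers reads without a join but goes stale when the source tables change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Courses (Course TEXT PRIMARY KEY, Instructor TEXT NOT NULL);
CREATE TABLE Students_Courses (
    StudentID INTEGER NOT NULL,
    Course    TEXT NOT NULL REFERENCES Courses(Course),
    PRIMARY KEY (StudentID, Course)
);
INSERT INTO Courses VALUES
    ('Math', 'Dr. Smith'), ('Science', 'Dr. Jones'), ('History', 'Dr. Brown');
INSERT INTO Students_Courses VALUES
    (1, 'Math'), (1, 'Science'), (2, 'History'), (2, 'Math'), (3, 'Science');

-- Denormalized copy for reporting: the join is paid once, at build time.
CREATE TABLE Enrollment_Report AS
SELECT sc.StudentID, sc.Course, c.Instructor
FROM Students_Courses AS sc
JOIN Courses AS c ON c.Course = sc.Course;
""")

# Reads against the copy need no join.
before = conn.execute(
    "SELECT Instructor FROM Enrollment_Report "
    "WHERE StudentID = 1 ORDER BY Course"
).fetchall()
print(before)  # [('Dr. Smith',), ('Dr. Jones',)]

# A hypothetical reassignment in the normalized source table...
conn.execute("UPDATE Courses SET Instructor = 'Dr. Brown' WHERE Course = 'Math'")

# ...leaves the redundant copy out of date until it is rebuilt.
stale_rows = conn.execute("""
    SELECT COUNT(*)
    FROM Enrollment_Report AS r
    JOIN Courses AS c ON c.Course = r.Course
    WHERE r.Instructor <> c.Instructor
""").fetchone()[0]
print(stale_rows)  # 2
```

In practice a copy like this has to be refreshed on a schedule or kept in sync by the application, which is the integrity cost the note warns about.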

Conclusion
Database normalization is a fundamental concept in relational database
design that ensures data is stored efficiently and accurately. By following
the normal forms, you can create a robust database structure that
minimizes redundancy, prevents anomalies, and maintains data integrity.
Understanding and applying normalization principles is essential for
designing scalable and maintainable databases.
