Functional dependence and Data normalization
In the context of databases, a functional dependency is a relationship between two sets of attributes
(columns) in a relation (table), where knowing the value of one set of attributes uniquely determines the
value of another set of attributes.
NB: In other words, a functional dependency describes the relationship between the values of certain
columns in a table.
Consider a table that records students (Student_ID, Student_Name) and the courses they are enrolled in
(Course_Code, Course_Title). The following functional dependencies hold in it:
1. Student_ID → Student_Name:
Knowing the Student_ID uniquely determines the Student_Name. For example, if Student_ID is 1, the
corresponding Student_Name is Alice. This dependency ensures that each student ID is associated with
only one student name.
2. Course_Code → Course_Title:
Knowing the Course_Code uniquely determines the Course_Title. For example, if Course_Code is
CS101, the corresponding Course_Title is Introduction to Computer Science. This dependency ensures
that each course code is associated with only one course title.
3. (Student_ID, Course_Code) → Student_Name:
Knowing both the Student_ID and Course_Code uniquely determines the Student_Name. For example,
if Student_ID is 1 and Course_Code is CS101, the corresponding Student_Name is Alice. This
dependency holds only because Student_ID alone already determines Student_Name (dependency 1);
adding Course_Code to the left-hand side contributes nothing to it.
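The three dependencies above can be checked mechanically against sample rows. The following is a minimal Python sketch, not part of the original notes: the helper name fd_holds and the sample rows (including the course MA101 and the student Bob) are assumptions made for illustration. A check like this can only show that the data at hand violates a dependency; whether a dependency holds in general is a rule of the schema, not of one sample.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are sequences of column names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        # The same left-hand side must always map to the same right-hand side.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows (only Alice and CS101 come from the text).
enrollments = [
    {"Student_ID": 1, "Student_Name": "Alice",
     "Course_Code": "CS101", "Course_Title": "Introduction to Computer Science"},
    {"Student_ID": 1, "Student_Name": "Alice",
     "Course_Code": "MA101", "Course_Title": "Calculus I"},
    {"Student_ID": 2, "Student_Name": "Bob",
     "Course_Code": "CS101", "Course_Title": "Introduction to Computer Science"},
]

print(fd_holds(enrollments, ["Student_ID"], ["Student_Name"]))                 # True
print(fd_holds(enrollments, ["Course_Code"], ["Course_Title"]))                # True
print(fd_holds(enrollments, ["Student_ID", "Course_Code"], ["Student_Name"]))  # True
print(fd_holds(enrollments, ["Student_Name"], ["Course_Code"]))                # False
```

The last call fails because Alice appears with two different course codes, so Student_Name does not determine Course_Code.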
NB: Functional dependencies are crucial for maintaining data integrity and consistency in databases. They
help ensure that the data is stored in a structured manner, without redundancies or inconsistencies, and
facilitate efficient querying and manipulation of data.
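A full functional dependency occurs when an attribute depends on the whole of a composite key and on no
proper subset of it. In our example, the attribute Student_Name is not fully functionally dependent on the
primary key (Student_ID, Course_Code): the combination does determine Student_Name, but Student_ID
alone already does so. An attribute such as the grade a student earns in a course (a hypothetical column,
not part of the example table) would be fully dependent, because both Student_ID and Course_Code are
needed to identify it.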
A partial functional dependency occurs when an attribute (or set of attributes) is functionally dependent
on only a part of the primary key, and not on the entire primary key.
Consider the attribute Course_Title in relation to the primary key (Student_ID, Course_Code). If the
Course_Title depends only on Course_Code and not on Student_ID, then it exhibits a partial functional
dependency.
In our example, the attribute Course_Title is partially functionally dependent on the primary key
(Student_ID, Course_Code). Each Course_Code uniquely determines the Course_Title, irrespective of
the corresponding Student_ID. Therefore, Course_Title depends partially on the primary key, as it is not
fully dependent on both Student_ID and Course_Code.
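One way to make the distinction between full and partial dependency concrete is to test every proper subset of the key. The sketch below is an illustration under stated assumptions: the helper names fd_holds and partial_determinants, the sample rows, and the Grade column are all hypothetical and do not come from the original table.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def partial_determinants(rows, key, attribute):
    """List the proper, non-empty subsets of key that already determine attribute.

    An empty result means the attribute is fully dependent on the key
    (as far as these sample rows can tell).
    """
    found = []
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            if fd_holds(rows, subset, [attribute]):
                found.append(subset)
    return found


# Hypothetical rows; Grade is an assumed extra column used for contrast.
rows = [
    {"Student_ID": 1, "Student_Name": "Alice", "Course_Code": "CS101",
     "Course_Title": "Introduction to Computer Science", "Grade": "A"},
    {"Student_ID": 1, "Student_Name": "Alice", "Course_Code": "MA101",
     "Course_Title": "Calculus I", "Grade": "B"},
    {"Student_ID": 2, "Student_Name": "Bob", "Course_Code": "CS101",
     "Course_Title": "Introduction to Computer Science", "Grade": "B"},
]
key = ("Student_ID", "Course_Code")

print(partial_determinants(rows, key, "Course_Title"))  # [('Course_Code',)]
print(partial_determinants(rows, key, "Student_Name"))  # [('Student_ID',)]
print(partial_determinants(rows, key, "Grade"))         # []
```

Course_Title and Student_Name are each determined by one part of the key, so they are partially dependent; the assumed Grade column needs both parts.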
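A transitive dependency occurs when a non-key attribute depends on the primary key only indirectly,
through another attribute: if A → B and B → C, then A → C holds transitively. To illustrate, suppose the
table used earlier also records the Department that offers each course. Department then depends on the
primary key (Student_ID, Course_Code) only through the course. Note that Course_Code is part of the
primary key, not a non-key attribute, so a dependency Course_Code → Department is strictly a partial
dependency; the dependency is transitive in the strict sense when Department is reached through a
non-key attribute.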
DATA NORMALIZATION
Data normalization is the process of organizing the data in a database so that redundancy and undesirable
dependencies between attributes are reduced. This minimizes data anomalies and aims to store each fact
only once, which enhances data integrity and simplifies database management.
The normal forms covered here (1NF, 2NF, and 3NF) progressively eliminate data redundancy and
dependency anomalies. Adhering to their rules protects the database against insertion, deletion, and
update anomalies and helps maintain data integrity and consistency.
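An update anomaly is easiest to see on a small example. The sketch below is illustrative only: the rows, the new title "Intro to CS", and the variable names are assumptions, not data from these notes. It stores a course title redundantly, changes it in one place, and shows the resulting inconsistency, then contrasts this with a layout that stores the title once.

```python
# Redundant layout: the title of CS101 is repeated on every enrollment row.
enrollments = [
    {"Student_ID": 1, "Course_Code": "CS101",
     "Course_Title": "Introduction to Computer Science"},
    {"Student_ID": 2, "Course_Code": "CS101",
     "Course_Title": "Introduction to Computer Science"},
]

# The course is renamed, but only one of the two rows is updated.
enrollments[0]["Course_Title"] = "Intro to CS"

titles_redundant = {row["Course_Title"] for row in enrollments
                    if row["Course_Code"] == "CS101"}
print(len(titles_redundant))  # 2 -> the same course now has two titles

# Normalized layout: the title is stored once, keyed by Course_Code.
courses = {"CS101": "Introduction to Computer Science"}
enrolled = [(1, "CS101"), (2, "CS101")]

courses["CS101"] = "Intro to CS"  # a single update, no stale copies

titles_normalized = {courses[code] for _, code in enrolled if code == "CS101"}
print(len(titles_normalized))  # 1
```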
To achieve the first normal form (1NF), we need to ensure that each attribute contains atomic values (no
multi-valued attributes) and that there are no repeating groups within rows. Let's normalize a
hypothetical table to 1NF:
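Consider the following table, which contains information about students and the courses they are
enrolled in:
Student_ID  Student_Name  Courses
1           Alice         Math, Science, English
2           Bob           History, Math
3           Charlie       Science
4           Dave          English, History
In this table, the Courses attribute holds several values for a single student, so it is not atomic.
To normalize this table to 1NF, we need to split the multi-valued attribute Courses into separate rows, each
with a single course for each student:
Student_ID  Student_Name  Course
1           Alice         Math
1           Alice         Science
1           Alice         English
2           Bob           History
2           Bob           Math
3           Charlie       Science
4           Dave          English
4           Dave          History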
Now, each attribute contains atomic values, and there are no repeating groups within rows. The table is in
the first normal form (1NF).
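The split into one row per course can be sketched in Python as follows. This is a hedged illustration: the function name to_1nf and the comma-separated storage of the Courses attribute are assumptions, and the unnormalized rows are inferred from the Students and Courses tables that appear later in these notes.

```python
# Assumed unnormalized rows: Courses holds several values in one field.
unnormalized = [
    {"Student_ID": 1, "Student_Name": "Alice", "Courses": "Math, Science, English"},
    {"Student_ID": 2, "Student_Name": "Bob", "Courses": "History, Math"},
    {"Student_ID": 3, "Student_Name": "Charlie", "Courses": "Science"},
    {"Student_ID": 4, "Student_Name": "Dave", "Courses": "English, History"},
]


def to_1nf(rows, multi_valued, single, sep=","):
    """Emit one row per value of the multi-valued column.

    multi_valued is the name of the column to split; single is the name
    of the atomic column that replaces it.
    """
    result = []
    for row in rows:
        for value in row[multi_valued].split(sep):
            new_row = {k: v for k, v in row.items() if k != multi_valued}
            new_row[single] = value.strip()
            result.append(new_row)
    return result


first_nf = to_1nf(unnormalized, "Courses", "Course")
for row in first_nf:
    print(row["Student_ID"], row["Student_Name"], row["Course"])
```

Each output row now carries exactly one course, at the cost of repeating Student_Name on several rows.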
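To achieve the second normal form (2NF), we need to ensure that there are no partial dependencies. This
means that each non-key attribute should be fully functionally dependent on the entire primary key.
In our current table, a student appears on several rows, so Student_ID alone cannot be the primary key;
the key is the combination (Student_ID, Course). Student_Name depends on Student_ID alone, which is
only part of that key. This is a partial dependency, so the table is not yet in 2NF.
We achieve 2NF by moving the student details into their own table and keeping the enrollments in a
separate table for courses: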
Students Table:
Student_ID  Student_Name
1           Alice
2           Bob
3           Charlie
4           Dave
Courses Table:
Student_ID  Course
1           Math
1           Science
1           English
2           History
2           Math
3           Science
4           English
4           History
In this arrangement:
• The Students table contains information about students, with Student_ID as the primary key.
• The Courses table records which courses each student takes, with Student_ID as a foreign key
referencing the Student_ID in the Students table, forming a one-to-many relationship between
students and course enrollments. Its primary key is the combination (Student_ID, Course).
• Student_Name, the only non-key attribute, is fully functionally dependent on the entire primary
key (Student_ID) of the Students table. The Courses table has no non-key attributes, so it cannot
contain a partial dependency.
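The decomposition above can be sketched as two projections of the 1NF rows, followed by a join that confirms no information was lost. The variable names are assumptions chosen for illustration; the data matches the Students and Courses tables shown above.

```python
# 1NF rows as (Student_ID, Student_Name, Course) tuples.
first_nf = [
    (1, "Alice", "Math"), (1, "Alice", "Science"), (1, "Alice", "English"),
    (2, "Bob", "History"), (2, "Bob", "Math"),
    (3, "Charlie", "Science"),
    (4, "Dave", "English"), (4, "Dave", "History"),
]

# Projection onto (Student_ID, Student_Name); the set removes duplicates.
students = sorted({(sid, name) for sid, name, _ in first_nf})

# Projection onto (Student_ID, Course).
courses = sorted({(sid, course) for sid, _, course in first_nf})

print(students)  # [(1, 'Alice'), (2, 'Bob'), (3, 'Charlie'), (4, 'Dave')]
print(len(courses))  # 8

# Joining the two tables on Student_ID reproduces the original rows,
# so the decomposition loses no information.
name_by_id = dict(students)
rejoined = sorted((sid, name_by_id[sid], course) for sid, course in courses)
print(rejoined == sorted(first_nf))  # True
```

Each student name is now stored once, so renaming a student touches a single row.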
Original Table:
In this table:
To normalize this table to the third normal form (3NF), we need to eliminate transitive dependencies, in
which a non-key attribute depends on the primary key only through another non-key attribute.
Specifically, Department depends on the primary key only indirectly, through Employee_Name, rather
than directly. We need to create separate tables for employees, projects, and departments.
Employees Table:
Employee_ID  Employee_Name
1            Alice
2            Bob
3            Charlie
Projects Table:
Project_ID  Project_Name
101         Project_A
102         Project_B
103         Project_C
Departments Table:
Employee_ID  Department
1            HR
2            Finance
3            IT
In this arrangement:
• The Employees table contains information about employees, with Employee_ID as the primary
key.
• The Projects table contains information about projects, with Project_ID as the primary key.
• The Departments table records the department of each employee, with Employee_ID as its primary
key and as a foreign key referencing the Employee_ID in the Employees table, so each employee is
assigned exactly one department.
• Each non-key attribute (Employee_Name in the Employees table, Project_Name in the Projects
table, and Department in the Departments table) is fully functionally dependent on the entire
primary key of its respective table.
NB: Therefore, all tables are in the third normal form (3NF).
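The three tables can be declared with explicit keys so that the database itself enforces the dependencies. The following is a minimal sketch using Python's built-in sqlite3 module. The column types, the NOT NULL constraints, and the rejected test row (employee 99, "Legal") are assumptions added for illustration; the table names, column names, and data come from the tables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Employees (
    Employee_ID   INTEGER PRIMARY KEY,
    Employee_Name TEXT NOT NULL
);
CREATE TABLE Projects (
    Project_ID   INTEGER PRIMARY KEY,
    Project_Name TEXT NOT NULL
);
CREATE TABLE Departments (
    Employee_ID INTEGER PRIMARY KEY REFERENCES Employees(Employee_ID),
    Department  TEXT NOT NULL
);
""")

conn.executemany("INSERT INTO Employees VALUES (?, ?)",
                 [(1, "Alice"), (2, "Bob"), (3, "Charlie")])
conn.executemany("INSERT INTO Projects VALUES (?, ?)",
                 [(101, "Project_A"), (102, "Project_B"), (103, "Project_C")])
conn.executemany("INSERT INTO Departments VALUES (?, ?)",
                 [(1, "HR"), (2, "Finance"), (3, "IT")])
conn.commit()

# Recombine employee and department data with a join on the key.
rows = conn.execute("""
    SELECT e.Employee_ID, e.Employee_Name, d.Department
    FROM Employees AS e
    JOIN Departments AS d ON d.Employee_ID = e.Employee_ID
    ORDER BY e.Employee_ID
""").fetchall()
print(rows)  # [(1, 'Alice', 'HR'), (2, 'Bob', 'Finance'), (3, 'Charlie', 'IT')]

# A department row for an employee that does not exist is rejected.
try:
    conn.execute("INSERT INTO Departments VALUES (?, ?)", (99, "Legal"))
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
print(fk_rejected)  # True
```

Declaring Employee_ID as the primary key of Departments is what limits each employee to one department in this sketch; a design that allowed several departments per employee would need a different key.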