Functional dependence and Data normalization

Functional dependency in databases describes the relationship where the value of one set of attributes uniquely determines the value of another set. It is crucial for maintaining data integrity and consistency, with types including full, partial, and transitive functional dependencies. Data normalization organizes data to reduce redundancy and dependency, ensuring integrity through various normal forms, from 1NF to 5NF.

Uploaded by

jkusekwa01
Copyright
© All Rights Reserved

FUNCTIONAL DEPENDENCY, WHAT IS IT?

In the context of databases, a functional dependency is a relationship between two sets of attributes
(columns) in a relation (table), where knowing the value of one set of attributes uniquely determines the
value of another set of attributes.
NB: In other words, if the columns X determine the columns Y (written X → Y), then any two rows that
agree on the values of X must also agree on the values of Y.

Let's consider a simple table to explain functional dependency:

Student_ID  Student_Name  Course_Code  Course_Title
1           Alice         CS101        Introduction to Computer Science
2           Bob           CS102        Data Structures
3           Charlie       CS101        Introduction to Computer Science
4           Dave          CS103        Database Management

In this table, we have information about students (identified by Student_ID and Student_Name) and the
courses they are enrolled in (identified by Course_Code and Course_Title).

Now, let's consider some functional dependencies:

1. Student_ID → Student_Name:
Knowing the Student_ID uniquely determines the Student_Name. For example, if Student_ID is 1, the
corresponding Student_Name is Alice. This dependency ensures that each student ID is associated with
only one student name.
2. Course_Code → Course_Title:
Knowing the Course_Code uniquely determines the Course_Title. For example, if Course_Code is
CS101, the corresponding Course_Title is Introduction to Computer Science. This dependency ensures
that each course code is associated with only one course title.
3. (Student_ID, Course_Code) → Student_Name:
Knowing both the Student_ID and Course_Code uniquely determines the Student_Name. For example,
if Student_ID is 1 and Course_Code is CS101, the corresponding Student_Name is Alice. This
dependency holds, but it adds nothing new: Student_ID alone already determines Student_Name, so
including Course_Code on the left-hand side does not change the result.
NB: Functional dependencies are crucial for maintaining data integrity and consistency in databases. They
help ensure that the data is stored in a structured manner, without redundancies or inconsistencies, and
facilitate efficient querying and manipulation of data.
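
The dependencies listed above can be checked mechanically against the example rows. The following is a minimal Python sketch; the function name fd_holds and the dictionary layout of the rows are assumptions made for illustration. Note that a check on sample rows can only disprove a dependency: a functional dependency is a rule about all valid data, not just the rows we happen to have.

```python
from typing import Dict, List, Sequence

Row = Dict[str, str]

# Rows copied from the example table.
ENROLMENT: List[Row] = [
    {"Student_ID": "1", "Student_Name": "Alice", "Course_Code": "CS101",
     "Course_Title": "Introduction to Computer Science"},
    {"Student_ID": "2", "Student_Name": "Bob", "Course_Code": "CS102",
     "Course_Title": "Data Structures"},
    {"Student_ID": "3", "Student_Name": "Charlie", "Course_Code": "CS101",
     "Course_Title": "Introduction to Computer Science"},
    {"Student_ID": "4", "Student_Name": "Dave", "Course_Code": "CS103",
     "Course_Title": "Database Management"},
]


def fd_holds(rows: List[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """Return True if rows that agree on lhs always agree on rhs (lhs -> rhs)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


print(fd_holds(ENROLMENT, ["Student_ID"], ["Student_Name"]))   # True
print(fd_holds(ENROLMENT, ["Course_Code"], ["Course_Title"]))  # True
print(fd_holds(ENROLMENT, ["Course_Code"], ["Student_Name"]))  # False: CS101 has Alice and Charlie
```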

The types of functional dependency in database normalization are:

1. Full Functional Dependency
2. Partial Functional Dependency
3. Transitive Functional Dependency

FULL FUNCTIONAL DEPENDENCY:


A full functional dependency occurs when an attribute (or set of attributes) is functionally dependent on
the entire primary key, and not on any proper subset of that key.
Let's test the attribute Student_Name against the composite primary key (Student_ID, Course_Code).
Student_Name would be fully functionally dependent only if it depended on the combination of Student_ID
and Course_Code, and on neither Student_ID nor Course_Code individually.

In our example that is not the case: Student_ID alone already determines Student_Name, so Student_Name
is not fully functionally dependent on (Student_ID, Course_Code). In fact, no non-key column of the
example table is fully dependent on this key. An attribute that describes the pairing itself, such as the
grade a student earns in a course (a hypothetical column, not in the table above), would be fully
functionally dependent on (Student_ID, Course_Code).

PARTIAL FUNCTIONAL DEPENDENCY:

A partial functional dependency occurs when an attribute (or set of attributes) is functionally dependent
on only a part of the primary key, and not on the entire primary key.

Consider the attribute Course_Title in relation to the primary key (Student_ID, Course_Code). If the
Course_Title depends only on Course_Code and not on Student_ID, then it exhibits a partial functional
dependency.

In our example, the attribute Course_Title is partially functionally dependent on the primary key
(Student_ID, Course_Code). Each Course_Code uniquely determines the Course_Title, irrespective of
the corresponding Student_ID. Therefore, Course_Title depends partially on the primary key, as it is not
fully dependent on both Student_ID and Course_Code.
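
The difference between a full and a partial dependency can be sketched in Python by testing every proper subset of the composite key. This is only an illustration: the function names are made up, and the rows below are hypothetical (a student taking two courses and a Grade column do not appear in the example table).

```python
from itertools import combinations
from typing import Dict, List, Sequence

Row = Dict[str, str]


def fd_holds(rows: List[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """True if rows that agree on lhs always agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def classify(rows: List[Row], key: Sequence[str], attribute: str) -> str:
    """Classify how `attribute` depends on the composite `key`."""
    if not fd_holds(rows, key, [attribute]):
        return "none"
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            if fd_holds(rows, subset, [attribute]):
                return "partial"  # a proper subset of the key is enough
    return "full"  # only the whole key determines the attribute


# Hypothetical rows: the second enrolment of student 1 and the Grade
# column are illustrative assumptions, NOT data from the text.
ROWS: List[Row] = [
    {"Student_ID": "1", "Course_Code": "CS101",
     "Course_Title": "Introduction to Computer Science", "Grade": "A"},
    {"Student_ID": "1", "Course_Code": "CS102",
     "Course_Title": "Data Structures", "Grade": "B"},
    {"Student_ID": "3", "Course_Code": "CS101",
     "Course_Title": "Introduction to Computer Science", "Grade": "C"},
]
KEY = ("Student_ID", "Course_Code")

print(classify(ROWS, KEY, "Course_Title"))  # partial
print(classify(ROWS, KEY, "Grade"))         # full
```

Course_Title is reported as partial because Course_Code alone determines it, while the hypothetical Grade needs both parts of the key.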

TRANSITIVE FUNCTIONAL DEPENDENCY:

• A transitive functional dependency occurs when an attribute (or set of attributes) is
functionally dependent on another non-key attribute, which in turn depends on the primary
key.
• In other words, the value of the dependent attribute(s) is indirectly determined by the
primary key through another non-key attribute.
• Transitive functional dependencies can lead to data anomalies and are generally
undesirable in database design.

Let's consider an example to illustrate transitive dependency using the table provided earlier:

Student_ID  Student_Name  Course_Code  Course_Title                      Department
1           Alice         CS101        Introduction to Computer Science  IT
2           Bob           CS102        Data Structures                   Computer Science
3           Charlie       CS101        Introduction to Computer Science  IT
4           Dave          CS103        Database Management               Information Systems

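In this example, Department is functionally dependent on Course_Code, as each Course_Code uniquely
determines the corresponding Department. Whether this is a transitive dependency depends on the key.
In the rows shown, each student is enrolled in exactly one course, so Student_ID alone can serve as the
primary key and Course_Code is a non-key attribute. Student_ID determines Course_Code, and
Course_Code in turn determines Department.

This scenario represents a transitive dependency: Department depends transitively on the primary key
Student_ID via the non-key attribute Course_Code. If students could take several courses and the primary
key were (Student_ID, Course_Code), Course_Code would be part of the key, and the dependency of
Department on Course_Code would be a partial dependency instead.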
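
A transitive chain can be tested on the sample rows with a short Python sketch. It assumes, as the sample rows allow, that Student_ID alone is the key; the helper names are illustrative only.

```python
from typing import Dict, List, Sequence

Row = Dict[str, str]

# Relevant columns of the example rows.
ROWS: List[Row] = [
    {"Student_ID": "1", "Course_Code": "CS101", "Department": "IT"},
    {"Student_ID": "2", "Course_Code": "CS102", "Department": "Computer Science"},
    {"Student_ID": "3", "Course_Code": "CS101", "Department": "IT"},
    {"Student_ID": "4", "Course_Code": "CS103", "Department": "Information Systems"},
]


def fd_holds(rows: List[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """True if rows that agree on lhs always agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_transitive(rows: List[Row], key: Sequence[str],
                  via: Sequence[str], target: Sequence[str]) -> bool:
    """True if key -> via and via -> target hold, while via does not determine key."""
    return (
        fd_holds(rows, key, via)
        and fd_holds(rows, via, target)
        and not fd_holds(rows, via, key)
    )


# Assumption: Student_ID alone is the primary key of the sample rows.
print(is_transitive(ROWS, ["Student_ID"], ["Course_Code"], ["Department"]))  # True
```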

DATA NORMALIZATION
Data normalization is the process of organizing data in a database efficiently. It involves structuring the
data in such a way that reduces redundancy and undesirable dependencies between attributes. This helps
in minimizing data anomalies and ensures that each fact is stored only once, which enhances data integrity
and simplifies database management.

TYPES OF DATA NORMALIZATION


There are several normal forms, each representing a stricter level of normalization:

1. First Normal Form (1NF):
• Ensures that each table has a primary key and that the data in each column is atomic
(indivisible).
2. Second Normal Form (2NF):
• Ensures that each non-key attribute is fully functionally dependent on the entire primary
key, eliminating partial dependencies.
3. Third Normal Form (3NF):
• Ensures that there are no transitive dependencies, where a non-key attribute depends on
another non-key attribute, which in turn depends on the primary key.
4. Boyce-Codd Normal Form (BCNF):
• A stricter version of 3NF where every determinant (the left-hand side of a non-trivial
functional dependency) must be a candidate key, or more precisely a superkey.
5. Fourth Normal Form (4NF):
• Ensures that there are no non-trivial multi-valued dependencies other than those whose
determinant is a candidate key. A multi-valued dependency exists when one attribute
determines a set of values of another attribute independently of the remaining attributes.
6. Fifth Normal Form (5NF):
• Ensures that every join dependency is implied by the candidate keys.

These levels of normalization progressively eliminate data redundancy and dependency anomalies,
ensuring data integrity and facilitating efficient database management.

The rules for each normal form can be summarized as follows:

1. First Normal Form (1NF):
• Each attribute must have atomic values (no multi-valued attributes).
• No repeating groups within rows.
• No duplicate rows: each row is uniquely identified by the primary key.
• Each cell must contain a single value.
2. Second Normal Form (2NF):
• Meet all the requirements of 1NF.
• No partial dependencies: Non-prime attributes are fully functionally dependent on
the entire primary key.
• As a result, facts about one part of the key are not repeated in every row that
shares that part.
3. Third Normal Form (3NF):
• Meet all the requirements of 2NF.
• No transitive dependencies: Non-prime attributes are not functionally dependent
on other non-prime attributes.
4. Boyce-Codd Normal Form (BCNF):
• Meet all the requirements of 3NF.
• Every determinant must be a candidate key (more precisely, a superkey): the left-hand
side of every non-trivial functional dependency identifies a whole row.
5. Fourth Normal Form (4NF):
• Meet all the requirements of BCNF.
• No non-trivial multi-valued dependencies, except those whose determinant is a
candidate key.
6. Fifth Normal Form (5NF):
• Meet all the requirements of 4NF.
• Every join dependency is implied by the candidate keys.

Adhering to these rules ensures that the database is free from data anomalies such as insertion,
deletion, and update anomalies, and maintains data integrity and consistency.
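
The BCNF rule (every determinant must identify a whole row) can be checked from a list of functional dependencies by computing attribute closures. The sketch below is one common textbook way of doing this, not a method taken from these notes; the function names and the set-based representation are assumptions for illustration.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

FD = Tuple[FrozenSet[str], FrozenSet[str]]  # (left-hand side, right-hand side)


def closure(attrs: Iterable[str], fds: List[FD]) -> Set[str]:
    """All attributes determined by `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation: FrozenSet[str], fds: List[FD]) -> List[FD]:
    """Non-trivial dependencies whose left-hand side is not a superkey of `relation`."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not relation <= closure(lhs, fds)
    ]


# Dependencies taken from the student/course example.
STUDENT_FD: FD = (frozenset({"Student_ID"}), frozenset({"Student_Name"}))
COURSE_FD: FD = (frozenset({"Course_Code"}), frozenset({"Course_Title"}))

enrolment = frozenset({"Student_ID", "Student_Name", "Course_Code", "Course_Title"})
students = frozenset({"Student_ID", "Student_Name"})

print(len(bcnf_violations(enrolment, [STUDENT_FD, COURSE_FD])))  # 2
print(len(bcnf_violations(students, [STUDENT_FD])))              # 0
```

In the combined table neither Student_ID nor Course_Code determines every column, so both dependencies violate the rule; in a table holding only Student_ID and Student_Name the determinant is the key and no violation remains.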

TO NORMALIZE A TABLE TO THE FIRST NORMAL FORM (1NF):

We need to ensure that each attribute contains atomic values (no multi-valued attributes) and that
there are no repeating groups within rows. Let's normalize a hypothetical table to 1NF.

Consider the following table, which contains information about students and the courses they are
enrolled in:

Student_ID  Student_Name  Courses
1           Alice         Math, Science, English
2           Bob           History, Math
3           Charlie       Science
4           Dave          English, History

In this table:

• Student_ID is the primary key.
• Student_Name contains atomic values.
• However, the Courses column violates 1NF because it contains multiple values separated by
commas, indicating a multi-valued attribute.

To normalize this table to 1NF, we need to split the multi-valued attribute Courses into separate rows, each
with a single course for each student:

Student_ID  Student_Name  Course
1           Alice         Math
1           Alice         Science
1           Alice         English
2           Bob           History
2           Bob           Math
3           Charlie       Science
4           Dave          English
4           Dave          History

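Now, each attribute contains atomic values, and there are no repeating groups within rows. The table is in
the first normal form (1NF). Because Student_ID now repeats, it no longer identifies a row on its own: the
primary key of this table is the combination (Student_ID, Course).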
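
The split into one row per course can be sketched in a few lines of Python. The function name to_1nf and the tuple layout are assumptions for illustration; the data is the table above.

```python
from typing import List, Tuple

# The unnormalized table: the third field holds a comma-separated list.
unnormalized: List[Tuple[int, str, str]] = [
    (1, "Alice", "Math, Science, English"),
    (2, "Bob", "History, Math"),
    (3, "Charlie", "Science"),
    (4, "Dave", "English, History"),
]


def to_1nf(rows: List[Tuple[int, str, str]]) -> List[Tuple[int, str, str]]:
    """Emit one (Student_ID, Student_Name, Course) row per listed course."""
    return [
        (student_id, name, course.strip())
        for student_id, name, courses in rows
        for course in courses.split(",")
    ]


normalized = to_1nf(unnormalized)
for row in normalized:
    print(row)
```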

TO NORMALIZE THE TABLE TO THE SECOND NORMAL FORM (2NF):

We need to ensure that there are no partial dependencies. This means that each non-key attribute should be
fully functionally dependent on the entire primary key.

In our current table, Student_ID repeats once per course, so the primary key is the combination
(Student_ID, Course). Student_Name depends on Student_ID alone, which is only part of that key. This is
a partial dependency, so the table is not yet in 2NF.
We achieve 2NF by moving the student names into their own table and keeping the enrolments in a
separate table:

Students Table:

Student_ID Student_Name
1 Alice
2 Bob
3 Charlie
4 Dave

Courses Table:

Student_ID Course
1 Math
1 Science
1 English
2 History
2 Math
3 Science
4 English
4 History

In this arrangement:

• The Students table contains information about students, with Student_ID as the primary key.
• The Courses table records which courses each student takes. Its primary key is the combination
(Student_ID, Course), and Student_ID is a foreign key referencing the Student_ID in the Students
table, forming a one-to-many relationship between students and their course enrolments.
• Student_Name, the only non-key attribute, is fully functionally dependent on the entire primary key
of the Students table (Student_ID). The Courses table has no non-key attributes, so it cannot
contain a partial dependency.

NB: Therefore, both tables are in the second normal form (2NF).
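
The two tables can be declared with their keys using Python's built-in sqlite3 module. This is a minimal sketch: the column types and the composite primary key on the Courses table are assumptions, since the notes do not give a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE Students (
        Student_ID   INTEGER PRIMARY KEY,
        Student_Name TEXT NOT NULL
    );
    CREATE TABLE Courses (
        Student_ID INTEGER NOT NULL REFERENCES Students(Student_ID),
        Course     TEXT NOT NULL,
        PRIMARY KEY (Student_ID, Course)  -- assumed composite key
    );
    """
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?)",
    [(1, "Alice"), (2, "Bob"), (3, "Charlie"), (4, "Dave")],
)
conn.executemany(
    "INSERT INTO Courses VALUES (?, ?)",
    [(1, "Math"), (1, "Science"), (1, "English"), (2, "History"),
     (2, "Math"), (3, "Science"), (4, "English"), (4, "History")],
)

# Joining the two tables reproduces the rows of the 1NF table.
rows = conn.execute(
    """
    SELECT s.Student_ID, s.Student_Name, c.Course
    FROM Students AS s
    JOIN Courses AS c ON c.Student_ID = s.Student_ID
    ORDER BY s.Student_ID, c.Course
    """
).fetchall()
for row in rows:
    print(row)
```

Each student name is now stored once, and the join shows that no information was lost by the split.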

TO NORMALIZE THE TABLE TO THE THIRD NORMAL FORM (3NF):


Let's consider a hypothetical table containing information about employees and the projects they are
assigned to:

Original Table:

Employee_ID  Employee_Name  Project_ID  Project_Name  Department
1            Alice          101         Project_A     HR
1            Alice          102         Project_B     HR
2            Bob            101         Project_A     Finance
3            Charlie        102         Project_B     IT
3            Charlie        103         Project_C     IT

In this table:

• Employee_ID and Project_ID together form the composite primary key.
• Employee_Name and Department are attributes dependent on Employee_ID.
• Project_Name is dependent on Project_ID.

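To normalize this table to the third normal form (3NF), we must first satisfy 2NF and then eliminate
transitive dependencies. Here Employee_Name and Department depend on Employee_ID alone, and
Project_Name depends on Project_ID alone, so each of them depends on only part of the composite
primary key. We remove these dependencies by creating separate tables for employees, projects, and
departments, so that every non-key attribute depends directly on the whole key of its own table.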

Employees Table:

Employee_ID Employee_Name
1 Alice
2 Bob
3 Charlie

Projects Table:

Project_ID Project_Name
101 Project_A
102 Project_B
103 Project_C

Departments Table:

Employee_ID Department
1 HR
2 Finance
3 IT

In this arrangement:

• The Employees table contains information about employees, with Employee_ID as the primary
key.
• The Projects table contains information about projects, with Project_ID as the primary key.
• The Departments table records the department of each employee. Employee_ID is its primary key
and also a foreign key referencing the Employee_ID in the Employees table, so each employee has
exactly one department.
• Each non-key attribute (Employee_Name in the Employees table, Project_Name in the Projects
table, and Department in the Departments table) is fully functionally dependent on the entire
primary key of its respective table, and none depends on another non-key attribute.
• To keep the information about which employee works on which project, the (Employee_ID,
Project_ID) pairs of the original table must also be stored in a linking table whose primary key is
that pair.

NB: Therefore, all tables are in the third normal form (3NF).
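
A quick way to confirm that a decomposition loses no information is to split the original rows and join them back. The Python sketch below does this for the employee and project data; the assignments list of (Employee_ID, Project_ID) pairs is an assumption added for the check, since the three tables above do not record which employee works on which project.

```python
from typing import Dict, List, Tuple

OriginalRow = Tuple[int, str, int, str, str]

# Rows of the original table:
# (Employee_ID, Employee_Name, Project_ID, Project_Name, Department)
original: List[OriginalRow] = [
    (1, "Alice", 101, "Project_A", "HR"),
    (1, "Alice", 102, "Project_B", "HR"),
    (2, "Bob", 101, "Project_A", "Finance"),
    (3, "Charlie", 102, "Project_B", "IT"),
    (3, "Charlie", 103, "Project_C", "IT"),
]

# Decomposed tables, each keyed by its primary key.
employees: Dict[int, str] = {e: name for e, name, _, _, _ in original}
projects: Dict[int, str] = {p: pname for _, _, p, pname, _ in original}
departments: Dict[int, str] = {e: dept for e, _, _, _, dept in original}

# Assumed linking table (not in the text): which employee is on which project.
assignments: List[Tuple[int, int]] = sorted({(e, p) for e, _, p, _, _ in original})

# Join the pieces back together on Employee_ID and Project_ID.
rejoined: List[OriginalRow] = [
    (e, employees[e], p, projects[p], departments[e]) for e, p in assignments
]

print(rejoined == original)  # True: the decomposition is lossless
```

Without the linking table the join could not be written at all, which is why the pairs have to be kept somewhere.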
