Module 3 Relational Database Design

Uploaded by

Shreyash Deotale
Copyright
© All Rights Reserved
Module 3

Relational Database Design


Contents
• Introduction to Normalization:
  • Explanation of the normalization process.
  • Understanding the need for normalization.
  • Discussing the goals of normalization, such as eliminating redundancy, minimizing data anomalies, and improving data integrity.
• Functional Dependencies:
  • Understanding the concept of functional dependencies.
  • Identifying and defining primary keys.
  • Exploring the concept of candidate keys.
• First Normal Form (1NF):
  • Definition of 1NF.
  • Ensuring atomicity of data by removing repeating groups.
  • Examples and exercises related to achieving 1NF.
• Second Normal Form (2NF):
  • Definition of 2NF.
  • Eliminating partial dependencies.
  • Recognizing and creating composite keys.
  • Examples and exercises related to achieving 2NF.
• Third Normal Form (3NF):
  • Definition of 3NF.
  • Removing transitive dependencies.
  • Ensuring all attributes depend only on the primary key.
  • Examples and exercises related to achieving 3NF.
• Boyce-Codd Normal Form (BCNF):
  • Definition of BCNF.
  • Understanding the relationship between BCNF and 3NF.
  • Examples and exercises related to achieving BCNF.
• Normalization Beyond 3NF:
  • Discussion on higher normal forms like 4NF, 5NF, and Domain-Key Normal Form (DK/NF).
  • Application and relevance of higher normal forms.
• Denormalization:
  • Understanding the concept of denormalization.
  • When and why denormalization may be necessary.
  • Considerations and trade-offs in denormalizing a database.
• Practical Implementation:
  • Applying normalization principles to real-world scenarios.
  • Designing normalized database schemas.
  • Discussing tools and techniques for implementing normalization.
A good database design has many benefits and is a goal to
achieve for every DBA:

• Easy Retrieval of Information


• If the design is developed properly, information is easier to retrieve. A correct design means the tables, constraints, and relationships are defined correctly.
• Easier Modification
• Changes made to the value of a given field do not adversely affect the values of other fields within the table.
• Easy to Maintain
• The database structure should be easy to maintain. In a sound design, a change to one field does not force changes in other fields.
• Information
• A good design improves the quality and consistency of information.
• Well-designed Database
• If the database is well designed, the flaws and issues of a poorly designed database do not have to be dealt with later.
Different Dependencies

• Functional Dependency (FD)

• Partial Dependency

• Transitive Dependency

• Multi-valued Dependency (MVD)

• Candidate Key Dependency

• Full Functional Dependency


Functional Dependency (FD)

• Functional Dependency (FD) is a concept in database management that describes a relationship between two sets of attributes in a relational database.
• It indicates that the values of one set of attributes (the dependent attributes) are uniquely determined by another set (the determinant attributes).
• In simpler terms, if you know the value of the determinant attributes, you can determine the value of the dependent attributes.

Functional Dependency (FD): It's like a rule in a table that says if you know one thing, you can figure out another thing.
A simple example: consider a table representing information about students in a school:

StudentID | Name | Age | Grade
1 | Alice | 18 | A
2 | Bob | 17 | B
3 | Charlie | 18 | A

In this table, let's say we have a functional dependency A -> B, where "A" is the determinant attribute set, and "B" is the dependent attribute set.
This means that knowing the values of "A" uniquely determines the values of "B."
In our example, let's consider the functional dependency "StudentID -> Name."
This implies that if you know the StudentID, you can uniquely determine the corresponding Name.
In other words, each student's name is uniquely identified by their StudentID.
For instance, if we know that StudentID 2 corresponds to Bob, and StudentID 1 corresponds to Alice, we can say that StudentID uniquely determines Name in this table.
So, the functional dependency "StudentID -> Name" holds true for this scenario.
In a formal notation, we would write it as: StudentID→Name
Functional dependencies are crucial in the normalization process of database design.
They help to organize data and ensure that relationships between attributes are well-defined, reducing redundancy and improving the integrity of the database.
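The check described above can be sketched in a few lines of Python. This is only an illustration: the helper name `fd_holds` and the list-of-dicts representation are assumptions, not part of the slides. A functional dependency is a rule about every possible state of the table, so a check on sample rows can refute a dependency but cannot prove it.

```python
def fd_holds(rows, determinant, dependent):
    """Return True if determinant -> dependent is not violated by rows."""
    seen = {}
    for row in rows:
        lhs = tuple(row[a] for a in determinant)
        rhs = tuple(row[a] for a in dependent)
        # The first time a determinant value is seen, remember its dependent value;
        # any later row with the same determinant must agree with it.
        if seen.setdefault(lhs, rhs) != rhs:
            return False
    return True


students = [
    {"StudentID": 1, "Name": "Alice", "Age": 18, "Grade": "A"},
    {"StudentID": 2, "Name": "Bob", "Age": 17, "Grade": "B"},
    {"StudentID": 3, "Name": "Charlie", "Age": 18, "Grade": "A"},
]

print(fd_holds(students, ["StudentID"], ["Name"]))  # True
print(fd_holds(students, ["Age"], ["Name"]))        # False: age 18 maps to Alice and Charlie
```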
Partial Dependency
Partial Dependency: It's like saying, "Some info depends on only part of the key, not the whole thing."
Example: Imagine you have a table with information about courses and instructors:

CourseID | Instructor | Department
101 | Prof. Smith | Math
102 | Prof. Brown | Physics
103 | Prof. White | Chemistry

Suppose we have a composite key (a key made up of more than one attribute), and we notice that one part of it determines something else.
Let's say our key is {CourseID, Instructor}, and we observe a partial dependency like this: {CourseID} -> {Department}.
This means that the Department depends only on part of the key (CourseID), not the whole key (CourseID and Instructor together).
For example, if we know the CourseID (let's say 101), we can figure out the Department (Math) without needing to know the Instructor.
So, the Department depends partially on the key: CourseID alone determines it.
In symbols, we write it as: CourseID→Department
Partial dependencies are important to recognize because they can lead to issues in database design.
They signal that our table might not be organized optimally, and we may need to adjust it to avoid problems down the road.
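One way to look for partial dependencies in sample data is to test every proper subset of the key, as in the sketch below. The function names `fd_holds` and `partial_determinants` are hypothetical helpers introduced for illustration, and the result only reflects the three sample rows.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """True if rows with equal lhs values always have equal rhs values."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        if seen.setdefault(left, right) != right:
            return False
    return True


def partial_determinants(rows, key, attribute):
    """Proper, non-empty subsets of the key that already determine attribute."""
    return [
        subset
        for size in range(1, len(key))
        for subset in combinations(key, size)
        if fd_holds(rows, subset, [attribute])
    ]


courses = [
    {"CourseID": 101, "Instructor": "Prof. Smith", "Department": "Math"},
    {"CourseID": 102, "Instructor": "Prof. Brown", "Department": "Physics"},
    {"CourseID": 103, "Instructor": "Prof. White", "Department": "Chemistry"},
]

# Both parts of the key show up, because each instructor also appears only once
# in this small sample.
print(partial_determinants(courses, ["CourseID", "Instructor"], "Department"))
# [('CourseID',), ('Instructor',)]
```

A non-empty result means the attribute is not fully dependent on the composite key, at least in the rows examined.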
Transitive Dependency
Transitive Dependency: It's like saying, "If A determines B, and B determines C, then A indirectly determines C."
Example: Imagine you have a table with information about students and their textbooks:

StudentID | StudentName | TextbookID | TextbookTitle
1 | Alice | 101 | Math Basics
2 | Bob | 102 | Physics Fund
3 | Charlie | 101 | Math Basics

Now, let's consider a transitive dependency: "StudentID determines TextbookTitle."

Here's the dependency chain:

•StudentID -> TextbookID (because each student has exactly one textbook)
•TextbookID -> TextbookTitle (because each textbook ID has a unique title)

So, indirectly, StudentID determines TextbookTitle.

If you know the StudentID (let's say 1), you can find out the TextbookID (101), and from that, you can determine the TextbookTitle (Math Basics).
In symbols, we write it as: StudentID→TextbookID→TextbookTitle
Transitive dependencies can lead to inefficiencies in database design, and recognizing them helps in organizing data more effectively to avoid problems like data redundancy.
Multi-valued Dependency (MVD)

• Multi-valued Dependency (MVD): It's like saying, "One value of an attribute can be associated with a whole set of values of another attribute, and that set is independent of the rest of the row."

Example: Imagine you have a table with information about employees and their projects:

EmployeeID | EmployeeName | ProjectID | ProjectName
1 | Alice | 101 | ProjectA
1 | Alice | 102 | ProjectB
2 | Bob | 101 | ProjectA
3 | Charlie | 103 | ProjectC

Now, let's consider a multi-valued dependency: "EmployeeID multi-determines ProjectName."
This means that for each employee, there can be multiple projects, and the projects are independent of each other. In other words, knowing one project doesn't tell you about the other projects the employee might be working on.
In symbols, we write it as: EmployeeID↠ProjectName
So, for example, if you know the EmployeeID (let's say 1), you can figure out the full set of associated projects (ProjectA and ProjectB), but not one single project.
Multi-valued dependencies are essential to recognize because they highlight situations where data about one thing (in this case, projects) doesn't rely on or influence data about another thing within the same table (like different projects for the same employee). Understanding MVD helps in designing databases more efficiently to avoid unnecessary data duplication.
Candidate Key Dependency
Candidate Key Dependency: It's like saying, "An attribute depends on the whole candidate key, not just part of it."
Example: Consider a table with information about students and their courses:

StudentID | CourseID | Grade
1 | Math101 | A
1 | Phys102 | B
2 | Math101 | C
3 | Chem201 | A

Now, let's talk about a candidate key - a minimal set of attributes that uniquely identifies each row.
In this case, let's consider the combination of StudentID and CourseID as a candidate key.
A candidate key dependency would mean that an attribute depends on the entire candidate key, not just part of it.

For example, let's say we have a candidate key dependency: {StudentID, CourseID} -> Grade.
This implies that the Grade depends on both the StudentID and the CourseID together, not just on StudentID or CourseID separately.
In symbols, we write it as: {StudentID, CourseID}→Grade
So, knowing both the StudentID and the CourseID uniquely determines the Grade for a particular row in the table.
Candidate key dependencies are important in database design to ensure that each piece of information is uniquely identified by the entire candidate key, avoiding any ambiguity or confusion.
Full Functional Dependency
Full Functional Dependency: It's like saying, "An attribute depends on the whole set of determining attributes, and not on a part of it."
Example: Consider a table with information about employees and their projects:

EmployeeID | ProjectID | ProjectName
1 | 101 | ProjectA
1 | 102 | ProjectB
2 | 101 | ProjectA
3 | 103 | ProjectC

Now, let's talk about a full functional dependency. Let's say we have a full functional dependency: {EmployeeID, ProjectID} -> ProjectName.

This means that the ProjectName depends on both the EmployeeID and the ProjectID together, and that no proper subset of {EmployeeID, ProjectID} determines it on its own.

In symbols, we write it as: {EmployeeID, ProjectID}→ProjectName

So, knowing both the EmployeeID and the ProjectID uniquely determines the ProjectName for a particular row in the table.
Note that in the sample rows above ProjectID alone already determines ProjectName (101 is always ProjectA), so as shown the dependency is actually partial. It would be a full functional dependency for an attribute that really needs both values, such as the hours a given employee spends on a given project.
Full functional dependencies help ensure that an attribute depends on the entire set of determining attributes, minimizing redundancy and maintaining a well-organized database.


Types of Functional Dependency
• Trivial Functional Dependency: A functional dependency A → B is trivial if B is a subset of A. Dependencies such as A → A and B → B are always trivial.

• For example, for a student table with the columns student_id, student_name, and student_dob, {student_id, student_name} → student_id is a trivial functional dependency, because student_id is a subset of {student_id, student_name}. Similarly, student_id → student_id and student_name → student_name are trivial functional dependencies.

• Non-Trivial Functional Dependency: A functional dependency A → B is non-trivial if B is not a subset of A.

• For example, student_id → student_name and student_name → student_dob.
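The subset test that separates trivial from non-trivial dependencies fits in one line. The function name `is_trivial` is a hypothetical helper used only for this sketch.

```python
def is_trivial(lhs, rhs):
    """An FD lhs -> rhs is trivial when rhs is a subset of lhs."""
    return set(rhs) <= set(lhs)


print(is_trivial({"student_id", "student_name"}, {"student_id"}))  # True
print(is_trivial({"student_id"}, {"student_name"}))                # False
```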
• Inference Rules
• Armstrong's axioms provide the basis for the inference rules.
• The functional dependencies that hold in a relational database can be deduced using Armstrong's axioms.
• An inference rule is a kind of assertion: given that certain functional dependencies hold, it states that another one must hold as well.
• The rules can be used to derive additional functional dependencies from an initial set of FDs.
• There are 6 inference rules for functional dependencies: the three axioms themselves (IR1 to IR3) and three rules derived from them (IR4 to IR6).
IR1 - Reflexive Rule: According to this rule, if B (a set of attributes) is a subset of A (another set of attributes), then A determines B.
If A ⊇ B, then A → B

The subset relationship can be pictured with two sets, A = {Animals} and B = {Domestic Animals}:

A | B
Dog | Dog
Cat | Cat
Elephant |
Lion |

Set A represents all animals, and set B represents domestic animals.
Since every element in set B (Domestic Animals) is also an element of set A (Animals), we can say that A ⊇ B.
In the reflexive rule, A and B are sets of attributes (column names) rather than sets of values: if every attribute in B is also in A, then A → B holds automatically. For example, {StudentID, Name} → Name holds because two rows that agree on StudentID and Name necessarily agree on Name.
Note that the analogy covers only the subset relationship; it does not mean that every animal in A is also in B.
IR2 - Augmentation Rule: According to this rule, if A determines B, then AC determines BC for any set of attributes C. This is also known as the partial dependency rule.
If A → B, then AC → BC

Example:
Suppose we have two sets of attributes:
•Set A: {Student ID, Name}
•Set B: {Grade}

Student ID | Name | Grade
001 | Alice | A
002 | Bob | B
003 | Charlie | C
... | ... | ...

We have a functional dependency A → B, which means that the Grade (B) depends on the Student ID and Name (A).

Now add a new set of attributes C, representing the subject:
•Set C: {Subject}

Student ID | Name | Subject | Grade
001 | Alice | Math | A
002 | Bob | English | B
003 | Charlie | Science | C
... | ... | ... | ...

According to the Augmentation Rule (Partial Dependency Rule), if A → B, then AC → BC. This means that if the Grade (B) is determined by the Student ID and Name (A), then the Grade and Subject together (BC) are determined by the Student ID, Name, and Subject together (AC). Adding the same attributes to both sides of a dependency never invalidates it. This illustrates the Augmentation Rule in action.
Transitive Rule (IR3)
In the transitive rule, if X determines Y and Y determines Z,
then X must also determine Z.

1. If X → Y and Y → Z then X → Z
Union Rule (IR4)
This rule is also known as the additive rule. If X determines Y and X determines Z, then X also determines Y and Z together.

1. If X → Y and X → Z then X → YZ

Proof:

1. X → Y (given)
2. X → Z (given)
3. X → XY (using IR2 on 1, augmenting with X, where XX = X)
4. XY → YZ (using IR2 on 2, augmenting with Y)
5. X → YZ (using IR3 on 3 and 4)
Decomposition Rule (IR5)
This rule is the reverse of the Union rule and is also known as the projective rule.

If X determines Y and Z together, then X determines Y and Z separately.

1. If X → YZ then X → Y and X → Z

Proof:

1. X → YZ (given)
2. YZ → Y (using IR1)
3. X → Y (using IR3 on 1 and 2)
4. YZ → Z (using IR1)
5. X → Z (using IR3 on 1 and 4)
Pseudo transitive Rule (IR6)
In the pseudo transitive rule, if X determines Y, and YZ determines W, then XZ also determines W.

1. If X → Y and YZ → W then XZ → W

Proof:

1. X → Y (given)
2. YZ → W (given)
3. XZ → YZ (using IR2 on 1 by augmenting with Z)
4. XZ → W (using IR3 on 3 and 2)
Anomalies
• What are the Anomalies in DBMS?
• Normalization is required to organise data in a database. If it is not done, the overall data integrity of the database deteriorates over time, chiefly because of data anomalies. These anomalies are common, and they leave the database holding data that no longer matches the real-world facts it is supposed to reflect.

• Anomalies occur when there is too much redundancy in the information stored in the database. They are also bound to occur when the tables that make up the database are poorly constructed.
How are Anomalies Caused in DBMS?
What exactly does "bad construction" imply? When designing the database, the designer should identify the entities that rely on one another for existence, such as hotel rooms and the hotel, and then reduce the probability that one could ever exist in the database independently of the other.

A database anomaly is a fault in a database that usually emerges as a result of poor planning and of storing everything in a single flat table. In most cases it is removed through the normalization procedure, which involves splitting tables and joining them back when needed. The purpose of normalization is to minimise the negative effects of creating tables that would produce anomalies in the database.
Example
Consider a manufacturing firm that keeps worker information in a table called employee, which has four columns:
w_id for the employee’s id, w_name for the employee’s name, w_address for the employee’s address, and w_dept for
the employee’s department. The table will look like this at some point:

The table above has not been normalized. We’ll look at the issues that arise when the table isn’t normalized.
Type of Anomalies in DBMS
Various types of anomalies can occur in a DB. Redundancy-related anomalies are a frequent topic in exams and job interviews, and they can be identified and fixed fairly easily. The ones we should worry about fall into three major categories:

1. Update

2. Insert

3. Delete

Update Anomaly

Employee David has two rows in the table given above since he works in two different departments. If we want to change David’s address, we must do so in two rows,
else the data would become inconsistent.

If the proper address is updated in one of the departments but not in another, David will have two different addresses in the database, which is incorrect and leads to
inconsistent data.

Insert Anomaly
If a new worker joins the firm and is currently unassigned to any department, we will be unable to put the data into the table because the w_dept field does not allow
nulls.

Delete Anomaly
If the corporation closes the department F890 at some point in the future, deleting the rows with w_dept as F890 will also erase the information of employee Mike,
who is solely assigned to this department.
What is Normalization?

Normalization is like tidying up this information to make it more organized and efficient. It's a process in database design that helps prevent problems and ensures data is well-structured.
Why Do We Need Normalization?
• Avoiding Redundancy: In the table above, the student's name is repeated, which can waste space
and lead to inconsistencies. Normalization helps eliminate unnecessary repetition.

• Minimizing Errors: Redundancy can cause errors. For example, if Alice's name is misspelled, the correction has to be made in every row that repeats it, and missing one row leaves the data inconsistent. Normalization reduces the chance of these errors.

• Improving Efficiency: By organizing data smartly, normalization makes it easier to update, insert,
and delete information without introducing problems.
Goals of Normalization:
• Eliminate Redundancy: Store each piece of information in one place, avoiding unnecessary
repetition.

• Minimize Data Anomalies: Reduce the risk of errors, like conflicting information or incomplete
updates.

• Improve Data Integrity: Ensure that relationships between pieces of information are well-defined
and consistent.
Data Anomalies
• Data Anomalies: Data anomalies are problems or irregularities that can occur in a database when
it is not properly designed or organized.

• These anomalies can lead to inconsistencies, errors, and difficulties in managing and retrieving
information.

• There are three main types of data anomalies: insertion anomalies, update anomalies, and deletion
anomalies.
Insertion Anomalies:
•What: These occur when you try to add new information to the database, but you can't because of incomplete data.
•Example: In a table tracking students and their courses, if you can't add a new course without assigning it to a student,
you have an insertion anomaly.

StudentsCourses Table:

StudentID StudentName CourseID CourseName Grade


1 Alice 101 Math101 A
1 Alice 102 Physics102 B
2 Bob 101 Math101 C
3 Charlie 103 Chem103 A

Now, let's say you want to add a new course to the database, but there's a requirement that every course must be
assigned to a student. This could lead to an insertion anomaly because you cannot add a new course without assigning it
to a student.
Update Anomalies:
•What: These happen when updating information in one place, but not updating it everywhere it needs to be updated.
•Example: If a student changes their name, and you update it in one record but forget to update it in all the records, you have an update
anomaly.

StudentsCourses Table:

StudentID StudentName CourseID CourseName Grade


1 Alice 101 Math101 A
1 Alice 102 Physics102 B
2 Bob 101 Math101 C
3 Charlie 103 Chem103 A

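Suppose Alice decides to change her name from "Alice" to "Alicia." Updating a single record is straightforward, for example the row for course 101.

However, if you forget to update all occurrences of Alice's name, you have an update anomaly: the row for course 102 still says "Alice," so StudentID 1 now appears under two different names.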
Deletion Anomalies:
•What: These occur when deleting information unintentionally removes other related information.
•Example: If removing a course deletes information about the instructor who teaches that course, you have a deletion anomaly.

StudentsCourses Table:

StudentID StudentName CourseID CourseName Grade


1 Alice 101 Math101 A
1 Alice 102 Physics102 B
2 Bob 101 Math101 C
3 Charlie 103 Chem103 A

Suppose you want to remove a course from the table. If deleting a course also removes information about the students enrolled in that course, you have a deletion anomaly. For example, deleting course 103 removes the only row that mentions Charlie.
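The update and deletion anomalies on the StudentsCourses table can be reproduced with a short Python sketch. The list-of-dicts representation and the variable names are assumptions made for illustration; a real database would run into the same problems through UPDATE and DELETE statements.

```python
enrolments = [
    {"StudentID": 1, "StudentName": "Alice", "CourseID": 101, "CourseName": "Math101", "Grade": "A"},
    {"StudentID": 1, "StudentName": "Alice", "CourseID": 102, "CourseName": "Physics102", "Grade": "B"},
    {"StudentID": 2, "StudentName": "Bob", "CourseID": 101, "CourseName": "Math101", "Grade": "C"},
    {"StudentID": 3, "StudentName": "Charlie", "CourseID": 103, "CourseName": "Chem103", "Grade": "A"},
]

# Update anomaly: the rename reaches only the first matching row.
partially_updated = [dict(row) for row in enrolments]
for row in partially_updated:
    if row["StudentID"] == 1:
        row["StudentName"] = "Alicia"
        break  # simulates forgetting the remaining rows
names_for_student_1 = {r["StudentName"] for r in partially_updated if r["StudentID"] == 1}
print(sorted(names_for_student_1))  # ['Alice', 'Alicia']

# Deletion anomaly: removing course 103 also removes every trace of Charlie.
after_delete = [row for row in enrolments if row["CourseID"] != 103]
remaining_students = {row["StudentName"] for row in after_delete}
print("Charlie" in remaining_students)  # False
```

Both problems disappear once student names live in a separate table keyed by StudentID, which is what the normal forms that follow work towards.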
Normalization is a database design technique that involves organizing tables and relationships in a relational database to reduce redundancy and improve data integrity. There are several normal forms, each building on the previous one. The most commonly discussed normal forms are:
1. First Normal Form (1NF): Ensures that the values in each column of a table are atomic (indivisible) and that there are no repeating groups or arrays. It deals with basic structure issues.
2. Second Normal Form (2NF): Builds on 1NF and eliminates partial dependencies. A table is in 2NF if it's in 1NF, and no non-prime attribute is dependent on only a part of any candidate key.
3. Third Normal Form (3NF): Extends 2NF by removing transitive dependencies. A table is in 3NF if it's in 2NF, and no non-prime attribute depends transitively on a candidate key (non-prime attributes are not dependent on other non-prime attributes).
4. Boyce-Codd Normal Form (BCNF): A stricter form that addresses certain anomalies not covered by 3NF. A table is in BCNF if, for every non-trivial functional dependency, the determinant is a superkey.
5. Fourth Normal Form (4NF): Focuses on multi-valued dependencies. A table is in 4NF if it's in BCNF and has no non-trivial multi-valued dependencies other than those whose determinant is a superkey.
6. Fifth Normal Form (5NF): Addresses join dependencies. A table is in 5NF if it's in 4NF and every non-trivial join dependency is implied by the candidate keys.
1st NORMAL FORM
• Definition:
• A relation (or table) is considered to be in 1NF if it satisfies the
following conditions:
• Every attribute (column) in the relation must be a single-valued
attribute.
• The attribute domain (the set of possible values for an attribute)
remains consistent.
• Each attribute has a unique name within the relation.
• The order in which data is stored does not impact the validity of the
relation.
• Why Is 1NF Important?
• Ensures data integrity: By eliminating redundancy and ensuring
atomic values, 1NF facilitates data processing.
• Prevents insertion, deletion, and update anomalies: These anomalies
occur when data is not properly normalized.
• Example:
• Let’s consider a relation called STUDENT:
• It contains attributes like ID, Name, and Courses.
• The Courses attribute is multi-valued, violating 1NF.

• To bring it into 1NF, we decompose it:

• Original Relation:
ID | Name | Courses
1 | A | c1, c2
2 | E | c3
3 | M | c2, c3

• Decomposed Relation (in 1NF):
ID | Name | Course
1 | A | c1
1 | A | c2
2 | E | c3
3 | M | c2
3 | M | c3
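The 1NF step for the STUDENT relation can be sketched as follows. It assumes the multi-valued Courses attribute is stored as a comma-separated string, which is an illustrative choice rather than something the slides specify; `to_1nf` is a hypothetical helper name.

```python
students = [
    {"ID": 1, "Name": "A", "Courses": "c1, c2"},
    {"ID": 2, "Name": "E", "Courses": "c3"},
    {"ID": 3, "Name": "M", "Courses": "c2, c3"},
]


def to_1nf(rows):
    """Emit one row per (student, course) so that every value is atomic."""
    flat = []
    for row in rows:
        for course in row["Courses"].split(","):
            flat.append({"ID": row["ID"], "Name": row["Name"], "Course": course.strip()})
    return flat


for row in to_1nf(students):
    print(row["ID"], row["Name"], row["Course"])
```

The result has five rows, one for each course a student takes, matching the decomposed relation.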
2ND NORMAL FORM
1. What Is 2NF?
• 2NF is a crucial step in database normalization. It helps organize data efficiently and reduces redundancy.
• By adhering to 2NF principles, you create more resilient and well-structured database tables.
2. Why Do We Need 2NF?
• To eliminate partial dependencies and further reduce data redundancy.
• To ensure that every non-key attribute is fully dependent on the primary key.
3. Key Points about 2NF:
• A table is in 2NF if it meets the following criteria:
• It is already in First Normal Form (1NF) (which ensures no repeating groups).
• Every non-key attribute depends on the whole primary key, not on just part of it.
• 2NF is mainly a concern for tables with composite keys (primary keys with multiple attributes); a 1NF table whose key is a single attribute cannot have a partial dependency.
Example:
Imagine a table with student information:
STUDENT_NO | COURSE_NO | COURSE_FEE
1 | C1 | 1000
2 | C2 | 1500
1 | C4 | 2000
4 | C3 | 1000
4 | C1 | 1000

Here, the key is {STUDENT_NO, COURSE_NO}, but COURSE_FEE depends only on COURSE_NO, which is just part of the key. This is a partial dependency, so the table is not in 2NF.

To achieve 2NF, we split the table into two related tables:

Table 1 (STUDENT_COURSES):
STUDENT_NO | COURSE_NO
1 | C1
2 | C2
1 | C4
4 | C3
4 | C1

Table 2 (COURSE_FEES):
COURSE_NO | COURSE_FEE
C1 | 1000
C2 | 1500
C3 | 1000
C4 | 2000
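The split into STUDENT_COURSES and COURSE_FEES is a pair of projections. The sketch below uses a hypothetical `project` helper, and it also rejoins the two tables to confirm that no information is lost for this sample.

```python
def project(rows, columns):
    """Project rows onto columns, dropping duplicate tuples (first occurrence kept)."""
    seen = set()
    result = []
    for row in rows:
        values = tuple(row[c] for c in columns)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(columns, values)))
    return result


enrolments = [
    {"STUDENT_NO": 1, "COURSE_NO": "C1", "COURSE_FEE": 1000},
    {"STUDENT_NO": 2, "COURSE_NO": "C2", "COURSE_FEE": 1500},
    {"STUDENT_NO": 1, "COURSE_NO": "C4", "COURSE_FEE": 2000},
    {"STUDENT_NO": 4, "COURSE_NO": "C3", "COURSE_FEE": 1000},
    {"STUDENT_NO": 4, "COURSE_NO": "C1", "COURSE_FEE": 1000},
]

student_courses = project(enrolments, ["STUDENT_NO", "COURSE_NO"])
course_fees = project(enrolments, ["COURSE_NO", "COURSE_FEE"])
print(len(student_courses), len(course_fees))  # 5 4

# Rejoin on COURSE_NO to check that the decomposition is lossless here.
fee_of = {row["COURSE_NO"]: row["COURSE_FEE"] for row in course_fees}
rejoined = [{**row, "COURSE_FEE": fee_of[row["COURSE_NO"]]} for row in student_courses]
print(rejoined == enrolments)  # True
```

The fee for C1 is now stored once instead of twice, so changing it requires a single update.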
• What is the Third Normal Form in DBMS?

• A given relation is said to be in third normal form when it is in 2NF and has no transitive dependency. That is, when no non-prime attribute depends transitively on a key, the relation is in 3NF.

• In simpler words,

• In a relation that is already in 2NF, when none of the non-primary-key attributes transitively depend on the primary key, we can say that the relation is in the third normal form, or 3NF.
• Rules Followed in 3rd Normal Form in DBMS
• A relation is in the third normal form when every non-trivial functional dependency P -> Q satisfies at least one of these conditions:
• P is a super key.
• Q is a prime attribute, meaning every element of Q forms a part of some candidate key.
• Example:
• Consider a student table:
STUD_NO | STUD_NAME | STUD_STATE | STUD_COUNTRY | STUD_AGE
1 | Alice | CA | USA | 20
2 | Bob | NY | USA | 22
3 | Carol | TX | USA | 21

• Here, STUD_COUNTRY depends indirectly on STUD_NO via STUD_STATE: STUD_NO -> STUD_STATE and STUD_STATE -> STUD_COUNTRY.

• To achieve 3NF, we decompose the table:
• STUDENT (STUD_NO, STUD_NAME, STUD_STATE, STUD_AGE)
• STATE_COUNTRY (STATE, COUNTRY)
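A minimal sketch of this decomposition, assuming the rows are held as Python dictionaries: the state-to-country mapping is pulled out into its own lookup, and the country column is dropped from the student rows. The variable names are illustrative only.

```python
students = [
    {"STUD_NO": 1, "STUD_NAME": "Alice", "STUD_STATE": "CA", "STUD_COUNTRY": "USA", "STUD_AGE": 20},
    {"STUD_NO": 2, "STUD_NAME": "Bob", "STUD_STATE": "NY", "STUD_COUNTRY": "USA", "STUD_AGE": 22},
    {"STUD_NO": 3, "STUD_NAME": "Carol", "STUD_STATE": "TX", "STUD_COUNTRY": "USA", "STUD_AGE": 21},
]

# STATE_COUNTRY: one entry per state; fail loudly if a state maps to two countries.
state_country = {}
for row in students:
    known = state_country.setdefault(row["STUD_STATE"], row["STUD_COUNTRY"])
    if known != row["STUD_COUNTRY"]:
        raise ValueError("STUD_STATE -> STUD_COUNTRY does not hold")

# STUDENT: the original rows without the transitively dependent column.
student = [{k: v for k, v in row.items() if k != "STUD_COUNTRY"} for row in students]

# The country is still recoverable through the state.
bob = next(row for row in student if row["STUD_NO"] == 2)
print(state_country[bob["STUD_STATE"]])  # USA
```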
• What is BCNF in DBMS?

• BCNF (Boyce-Codd Normal Form) is a stricter version of the third normal form (3NF), and it is often also known as the 3.5 Normal Form. 3NF does not remove all redundancy in cases where, for a functional dependency (say, A->B), A is not a super key of the table. To deal with such situations, BCNF was introduced.

• BCNF is based on functional dependencies, and all the candidate keys of the relation are taken into consideration. BCNF is stricter than 3NF and adds a constraint on top of the general definition of 3NF.
• A table or relation is said to be in BCNF if it is already in 3NF and, for every non-trivial functional dependency (let's say, X->Y), X is a super key (every candidate key is also a super key). In simple terms, the determinant X can never be a non-prime attribute or only part of a key.
• Rules for BCNF in DBMS
• A table or relation is in BCNF (Boyce-Codd Normal Form) if it satisfies the following two conditions from the definition above:
• It satisfies all the conditions of the Third Normal Form (3NF).
• For any non-trivial functional dependency (A->B), A is a super key. Unlike 3NF, BCNF makes no exception for the case where B is a prime attribute.
Example:
Let’s consider a student database:
Stu_ID | Stu_Branch | Stu_Course | Branch_Number | Stu_Course_No
101 | CS&E | DBMS | B_001 | 201
101 | CS&E | Comp Net | B_001 | 202
102 | ECE | VLSI Tech | B_003 | 401
102 | ECE | Mobile Comm| B_003 | 402

Functional Dependencies:
Stu_ID → Stu_Branch
Stu_Course → {Branch_Number, Stu_Course_No}
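Candidate Key: {Stu_ID, Stu_Course}
Neither Stu_ID nor Stu_Course on its own is a super key, so both functional dependencies violate BCNF.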
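Whether this table is in BCNF can be checked mechanically from the listed functional dependencies. The sketch below computes attribute closures and reports every non-trivial dependency whose determinant is not a super key. The function names `closure` and `bcnf_violations` are illustrative assumptions.

```python
def closure(attributes, fds):
    """All attributes functionally determined by the given attribute set."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attributes, fds):
    """Non-trivial FDs whose determinant is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs)                   # skip trivial FDs
        and closure(lhs, fds) != set(all_attributes)  # determinant is not a super key
    ]


attributes = {"Stu_ID", "Stu_Branch", "Stu_Course", "Branch_Number", "Stu_Course_No"}
fds = [
    (("Stu_ID",), ("Stu_Branch",)),
    (("Stu_Course",), ("Branch_Number", "Stu_Course_No")),
]

print(closure({"Stu_ID", "Stu_Course"}, fds) == attributes)  # True: the pair is a key
print(len(bcnf_violations(attributes, fds)))                 # 2: both FDs violate BCNF
```

Each reported dependency points at a table that would have to be split off to reach BCNF.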
4th Normal Form
• A relation is in 4NF if it is in Boyce-Codd normal form and has no non-trivial multi-valued dependency (other than one whose determinant is a super key).
• A multi-valued dependency A ↠ B exists when a single value of A is associated with a set of values of B, and that set is independent of the other attributes in the relation.

Example
STUDENT

STU_ID | COURSE | HOBBY
21 | Computer | Dancing
21 | Math | Singing
34 | Chemistry | Dancing
74 | Biology | Cricket
59 | Physics | Hockey

The given STUDENT table is in 3NF, but COURSE and HOBBY are two independent attributes.

Hence, there is no relationship between COURSE and HOBBY.

In the STUDENT relation, the student with STU_ID 21 has two courses, Computer and Math, and two hobbies, Dancing and Singing.

So there are multi-valued dependencies on STU_ID (STU_ID ↠ COURSE and STU_ID ↠ HOBBY), which lead to unnecessary repetition of data. Strictly, because the two facts are independent, the table would need a row for every course and hobby combination of student 21, including (Computer, Singing) and (Math, Dancing).

So to make the above table into 4NF, we can decompose it into two tables, one holding (STU_ID, COURSE) and one holding (STU_ID, HOBBY).
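A multi-valued dependency can be tested on sample rows by checking that, for each STU_ID, every course appears with every hobby. The sketch below uses a hypothetical `mvd_holds` helper. It shows that the five rows as printed do not yet satisfy STU_ID ↠ COURSE, because the rows (21, Computer, Singing) and (21, Math, Dancing) are absent. Adding them (an assumption about what the table is meant to contain) makes the dependency hold and shows the redundancy that 4NF removes.

```python
from itertools import product


def mvd_holds(rows, x, y, z):
    """Check x ->> y, where z lists every attribute that is in neither x nor y."""
    groups = {}
    for row in rows:
        key = tuple(row[a] for a in x)
        y_val = tuple(row[a] for a in y)
        z_val = tuple(row[a] for a in z)
        ys, zs, pairs = groups.setdefault(key, (set(), set(), set()))
        ys.add(y_val)
        zs.add(z_val)
        pairs.add((y_val, z_val))
    # Within each x-group, every y value must be paired with every z value.
    return all(pairs == set(product(ys, zs)) for ys, zs, pairs in groups.values())


student = [
    {"STU_ID": 21, "COURSE": "Computer", "HOBBY": "Dancing"},
    {"STU_ID": 21, "COURSE": "Math", "HOBBY": "Singing"},
    {"STU_ID": 34, "COURSE": "Chemistry", "HOBBY": "Dancing"},
    {"STU_ID": 74, "COURSE": "Biology", "HOBBY": "Cricket"},
    {"STU_ID": 59, "COURSE": "Physics", "HOBBY": "Hockey"},
]
# Assumed completion: the two missing combinations for student 21.
completed = student + [
    {"STU_ID": 21, "COURSE": "Computer", "HOBBY": "Singing"},
    {"STU_ID": 21, "COURSE": "Math", "HOBBY": "Dancing"},
]

print(mvd_holds(student, ["STU_ID"], ["COURSE"], ["HOBBY"]))    # False
print(mvd_holds(completed, ["STU_ID"], ["COURSE"], ["HOBBY"]))  # True

# The 4NF decomposition stores each independent fact once.
student_course = sorted({(r["STU_ID"], r["COURSE"]) for r in completed})
student_hobby = sorted({(r["STU_ID"], r["HOBBY"]) for r in completed})
print(len(completed), len(student_course), len(student_hobby))  # 7 5 5
```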
5NF
• A relation is in 5NF if it is in 4NF and contains no join dependency that is not implied by its candidate keys; any decomposition must also be lossless when joined back.
• 5NF is reached when the tables have been broken into as many smaller tables as possible in order to avoid redundancy.
• 5NF is also known as Project-join normal form (PJ/NF).
In the above table, John takes both the Computer and Math classes for Semester 1, but he doesn't take the Math class for Semester 2. In this case, the combination of all these fields is required to identify valid data.

Suppose we add a new semester, Semester 3, but do not yet know the subject or who will be taking it, so we would leave Lecturer and Subject as NULL. But all three columns together act as the primary key, so we can't leave the other two columns blank.

So to make the above table into 5NF, we can decompose it into three relations P1, P2 & P3:
P1
CLOSURE SET OF ATTRIBUTES
• The closure of a set of attributes in a database is all the other attributes that you can
figure out based on the given set of attributes.
• You use rules called functional dependencies to determine these additional
attributes.
• This helps in understanding how different attributes are related to each other in a
database table.
• Finding the closure of a set of attributes is needed for problems related to
NORMALIZATION.
• For example, you need to know how to compute the closure of a set of attributes to
check if a set of attributes is a candidate key or a superkey.
• You also need this algorithm to decompose non-normal tables into NORMAL
FORMS.
• The closure of a set of attributes X is the set of those attributes that can be functionally determined from X. The closure of X is denoted as X+.

• When given a closure problem, you’ll have a set of functional dependencies over which to compute the closure and the set X for which to find the closure. A FUNCTIONAL DEPENDENCY A1, A2, …, An -> B in a table holds if two records that have the same value of attributes A1, A2, …, An also have the same value for attribute B.

• Equivalently, X+ is the largest set of attributes such that any two records that have the same values for X also have the same values for every attribute in X+.
• Steps to Find Closure of an Attribute Set
• The following steps are followed to find the closure of an attribute set:
• Step-01: Add the attributes contained in the attribute set for which closure is being calculated to the result set.
• Step-02: Repeatedly add to the result set the attributes that can be functionally determined from the attributes already contained in the result set, until no more attributes can be added.
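The two steps above translate directly into a loop that runs until nothing new can be added. In the sketch below the function names are illustrative. The attribute names reuse the student example from the 3NF slide, and the dependency STUD_NO -> {STUD_NAME, STUD_STATE, STUD_AGE} is an assumption (it treats STUD_NO as the key of that table).

```python
def closure(attributes, fds):
    """Compute X+ for the attribute set X under the functional dependencies fds."""
    result = set(attributes)  # Step-01: start with X itself
    changed = True
    while changed:            # Step-02: repeat until no attribute can be added
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attributes, all_attributes, fds):
    """X is a super key when its closure covers every attribute of the table."""
    return closure(attributes, fds) == set(all_attributes)


all_attributes = {"STUD_NO", "STUD_NAME", "STUD_STATE", "STUD_COUNTRY", "STUD_AGE"}
fds = [
    ({"STUD_NO"}, {"STUD_NAME", "STUD_STATE", "STUD_AGE"}),  # assumed: STUD_NO is the key
    ({"STUD_STATE"}, {"STUD_COUNTRY"}),
]

print(sorted(closure({"STUD_STATE"}, fds)))             # ['STUD_COUNTRY', 'STUD_STATE']
print(is_superkey({"STUD_NO"}, all_attributes, fds))    # True
print(is_superkey({"STUD_STATE"}, all_attributes, fds)) # False
```

A super key whose proper subsets all fail the same test is a candidate key, which is how closure is used when checking normal forms.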
