Digital Assignment 1: Normalization
Submitted by:
18BCE2436 Prashant Kumar Jha
18BCE2414 Saugat Malla
18BCE0100 Chandrasekhar
Normalization
Normalization is used to generate a set of relation schemas that allows us to store
information without unnecessary redundancy, yet also allows us to retrieve
information easily. The approach is to design schemas that are in an appropriate
normal form. To determine whether a relation schema is in one of
the desirable normal forms, we need additional information about the real-world
enterprise that we are modeling with the database. The most common approach
is to use functional dependencies.
Characteristics of Normalization:
o Normalization is the process of organizing the data in a database.
o Normalization is used to minimize redundancy in a relation or set of
relations. It is also used to eliminate undesirable characteristics such as
insertion, update and deletion anomalies.
o Normalization divides larger tables into smaller tables and links them
using relationships.
o Normal forms are used to reduce redundancy in database tables.
Functional Dependencies:
A functional dependency is a relationship that exists when one attribute (or set of attributes) uniquely
determines another attribute.
If R is a relation with attributes X and Y, a functional dependency between the attributes is written as
X->Y, which specifies that Y is functionally dependent on X. Here X is the determinant set and Y is the
dependent attribute. Each value of X is associated with precisely one value of Y.
A functional dependency serves as a constraint between two sets of attributes in a database. Defining
functional dependencies is an important part of relational database design and is the basis of
normalization.
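The definition above can be checked mechanically on a sample of data. The following is a minimal sketch, assuming a relation is stored as a list of dictionaries; the function name holds and the student rows are hypothetical and only serve as an illustration. Note that a sample can show that a dependency is violated, but it cannot prove that the dependency holds for every possible state of the database.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # The first time x is seen, remember its y; later rows must match it.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical sample data (not taken from the assignment).
students = [
    {"roll_no": 1, "name": "Asha", "dept": "CSE"},
    {"roll_no": 2, "name": "Ravi", "dept": "ECE"},
    {"roll_no": 3, "name": "Asha", "dept": "ECE"},
]

print(holds(students, ["roll_no"], ["name"]))  # True
print(holds(students, ["name"], ["dept"]))     # False: "Asha" maps to CSE and ECE
```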
Types of Normal Forms
1NF:
A relation is in 1NF if every attribute contains only atomic values. As per the rule
of the first normal form, an attribute (column) of a table cannot hold multiple
values in a single row. It should hold only atomic values.
Example:
Emp_id Emp_name Emp_address Emp_mobile
In this table, two employees each have two mobile numbers stored in Emp_mobile.
The attribute is therefore not atomic, and the table is not in 1NF.
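One common way to bring such a table into 1NF is to store one row per mobile number. The sketch below assumes the column names from the header above; the rows themselves and the function name to_1nf are hypothetical, since the original data is not reproduced here.

```python
# Hypothetical rows: Emp_mobile holds a list, so the table is not in 1NF.
employees = [
    {"Emp_id": 101, "Emp_name": "A", "Emp_address": "Delhi",
     "Emp_mobile": ["9000000001", "9000000002"]},
    {"Emp_id": 102, "Emp_name": "B", "Emp_address": "Pune",
     "Emp_mobile": ["9000000003"]},
]


def to_1nf(rows, multi_attr):
    """Emit one row per value of the multi-valued attribute."""
    flat = []
    for row in rows:
        for value in row[multi_attr]:
            new_row = dict(row)           # copy the other attributes
            new_row[multi_attr] = value   # keep a single atomic value
            flat.append(new_row)
    return flat


flat = to_1nf(employees, "Emp_mobile")
for row in flat:
    print(row)
```

After this step every Emp_mobile value is atomic, at the cost of repeating the name and address of an employee who has several numbers.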
2NF:
A relation is in 2NF if it is in 1NF and all non-key attributes are fully
functionally dependent on the primary key.
A table is in 2NF if the following conditions hold:
Table should be in 1NF
No non-prime attribute is dependent on a proper subset of any candidate
key of the table.
Example:
T_id subject Teacher_Age
1 Physics 23
1 Maths 23
3 Chemistry 30
The above table is in 1NF because each attribute has atomic values, but it is not in
2NF: the candidate key is {T_id, subject}, and the non-prime attribute Teacher_Age
depends on T_id alone, which is a proper subset of that candidate key. This violates
the rule of 2NF.
To make the table comply with 2NF, we can split it into two tables:
T_id Teacher_Age
1 23
3 30
T_id subject
1 Physics
1 Maths
3 Chemistry
Now the above two tables comply with second normal form.
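The decomposition can be checked with a short sketch: project the teacher table onto the two smaller schemas, then join them back on T_id and confirm that no rows are lost or added. Tuples are used as rows here, and the variable names are chosen only for this illustration.

```python
# Rows are (T_id, subject, Teacher_Age), as in the example table.
teacher = [
    (1, "Physics", 23),
    (1, "Maths", 23),
    (3, "Chemistry", 30),
]

# Projections; using sets removes the duplicate (1, 23) row.
teacher_age = {(t_id, age) for t_id, _subject, age in teacher}
teacher_subject = {(t_id, subject) for t_id, subject, _age in teacher}

# Natural join of the two projections on T_id.
rejoined = {
    (t_id, subject, age)
    for t_id, subject in teacher_subject
    for other_id, age in teacher_age
    if t_id == other_id
}

print(sorted(teacher_age))       # [(1, 23), (3, 30)]
print(rejoined == set(teacher))  # True: the decomposition is lossless here
```

The age of teacher 1 is now stored once instead of once per subject, which is the redundancy that 2NF removes.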
3NF:
A relation is in 3NF if it is in 2NF and no transitive dependency exists.
A table is said to be in 3NF if both of the following conditions hold:
Table must be in 2NF
Transitive functional dependency of non-prime attributes on any super key
must be removed.
In the original employee table, Emp_state and Emp_city depend on Emp_zip, and
Emp_zip depends on Emp_id. This makes the non-prime attributes (Emp_state, Emp_city
and Emp_district) transitively dependent on the super key (Emp_id), which violates
the rule of 3NF.
To make the table comply with 3NF, we break it into two tables:
Employee table:
Emp_id Emp_name Emp_zip
1 John 12444
2 Tom 13434
3 Bob 12412
4 Rob 12435
Employee_zip table:
Emp_zip Emp_state Emp_city
12444 UP Agra
13434 TN Chennai
12412 UK Chennai
12435 MP Gwalior
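The transitive dependency can be made visible with the standard attribute closure computation. This is a minimal sketch: the function name closure is chosen for this illustration, and the list of functional dependencies is an assumption read off from the example (Emp_id determines the name and zip code, and Emp_zip determines the state and city).

```python
def closure(attrs, fds):
    """Return all attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its left side is already covered.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# Assumed dependencies for the employee example.
fds = [
    (("Emp_id",), ("Emp_name", "Emp_zip")),
    (("Emp_zip",), ("Emp_state", "Emp_city")),
]

print(sorted(closure({"Emp_id"}, fds)))
# ['Emp_city', 'Emp_id', 'Emp_name', 'Emp_state', 'Emp_zip']
print(sorted(closure({"Emp_zip"}, fds)))
# ['Emp_city', 'Emp_state', 'Emp_zip']
```

Emp_id reaches Emp_state and Emp_city only through Emp_zip, and the closure of Emp_zip does not contain Emp_id, so Emp_zip is not a super key. That is the situation 3NF forbids, and it is why the zip, state and city attributes are moved to their own table.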
Multi-Valued Dependency
Definition:
A multivalued dependency (MVD) on R, X ->-> Y, says that if two tuples
of R agree on all the attributes of X, then their components in Y may be
swapped, and the result will be two tuples that are also in the relation. That is,
for each value of X, the values of Y are independent of the values of R-X-Y.
If this condition holds for a relation (table), the relation is said to have a
multi-valued dependency.
For example, if a table has attributes P, Q and R, and Q and R are independent
multi-valued facts about P, this is represented by double arrows:
P->->Q
P->->R
General Example:
Drinkers(name, addr, phones, beersLiked)
• A drinker’s phones are independent of the beers they like. Thus, each of the
drinker’s phones appears with each of the beers they like in all combinations.
– If a drinker has 3 phones and likes 10 beers, then the drinker has 30 tuples.
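The 3 x 10 = 30 count, and the MVD itself, can be checked on generated data. The sketch below is only an illustration: the function name mvd_holds and the sample drinker are hypothetical. It uses the fact that X ->-> Y holds exactly when, for each value of X, every Y value appears with every value of the remaining attributes.

```python
from itertools import product

ATTRS = ("name", "addr", "phones", "beersLiked")


def mvd_holds(rows, x, y, attrs=ATTRS):
    """Check the MVD x ->-> y on rows given as tuples ordered like attrs."""
    pos = {a: i for i, a in enumerate(attrs)}
    z = [a for a in attrs if a not in x and a not in y]  # the rest: R - X - Y

    def pick(row, names):
        return tuple(row[pos[a]] for a in names)

    # Group the (Y, Z) pairs by their X value.
    groups = {}
    for row in set(rows):
        groups.setdefault(pick(row, x), set()).add((pick(row, y), pick(row, z)))

    for pairs in groups.values():
        ys = {p[0] for p in pairs}
        zs = {p[1] for p in pairs}
        # The MVD needs every Y value combined with every Z value.
        if len(pairs) != len(ys) * len(zs):
            return False
    return True


# Hypothetical drinker with 3 phones who likes 10 beers.
phones = [f"555-000{i}" for i in range(3)]
beers = [f"beer{i}" for i in range(10)]
drinkers = [("joe", "a1", p, b) for p, b in product(phones, beers)]

print(len(drinkers))                                  # 30
print(mvd_holds(drinkers, ["name"], ["phones"]))      # True
print(mvd_holds(drinkers[:-1], ["name"], ["phones"])) # False: one combination missing
```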
Example: A car manufacturer company that produces cars of different colors every year.
Car_model Manufacturing_year Color
C1 2009 Blue
C2 2010 Yellow
C3 2015 Orange
C4 2014 Green
C5 2019 Blue
In the above example, Manufacturing_year and Color are independent of each other
but both depend on Car_model. The two columns are therefore said to be
multi-valued dependent on Car_model.
Fourth Normal Form (4NF) comes into the picture when a multi-valued dependency
occurs in a relation. The example below shows such a dependency and how to remove
it so that the table satisfies the fourth normal form.
STU_ID COURSE HOBBY
1 Computer Hockey
2 Maths Football
2 Physics Singing
4 Chemistry Cricket
The above table is in 3NF, but course and hobby are two independent entities,
so there is no relationship between a student's courses and hobbies.
In the above table, the student with STU_ID 2 has two courses and two hobbies.
Both are multi-valued dependent on STU_ID, which leads to unnecessary
repetition of data.
To bring the above table into 4NF, we can decompose it into two tables:
STUDENT_COURSE
STU_ID COURSE
1 Computer
2 Maths
2 Physics
4 Chemistry
STUDENT_HOBBY
STU_ID HOBBY
1 Hockey
2 Football
2 Singing
4 Cricket
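A short sketch shows what the decomposition saves. If courses and hobbies are truly independent, a single table has to list every course and hobby combination for each student, while the two smaller tables store each fact once. The rows come from the example; the variable names are chosen only for this illustration.

```python
# Rows are (STU_ID, COURSE, HOBBY), as in the example table.
student = [
    (1, "Computer", "Hockey"),
    (2, "Maths", "Football"),
    (2, "Physics", "Singing"),
    (4, "Chemistry", "Cricket"),
]

# The two 4NF tables are projections of the original table.
student_course = sorted({(s, c) for s, c, _h in student})
student_hobby = sorted({(s, h) for s, _c, h in student})

# Joining them back on STU_ID pairs every course with every hobby per student.
combos = sorted(
    (s, c, h)
    for s, c in student_course
    for other, h in student_hobby
    if s == other
)

print(len(student_course), len(student_hobby))  # 4 4
print(len(combos))                              # 6
```

Student 2 contributes 2 x 2 = 4 combinations to the join, so a single table that fully respects the multi-valued dependency would need 6 rows, and the count grows multiplicatively as courses and hobbies are added.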
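Example:
Drinkers(name, addr, phones, beersLiked)
FD: name->addr
MVD: name->->phones
name->->beersLiked
- Key is {name, phones, beersLiked}
All of these dependencies violate 4NF, because in none of them is the left side a
super key. We can decompose as follows:
1. Drinkers1(name, addr)
- In 4NF; the only dependency is name->addr.
2. Drinkers2(name, phones, beersLiked)
- MVDs: name->->phones and name->->beersLiked
- All three attributes form the key
- The MVDs still violate 4NF because name is not a super key, so Drinkers2 must be
decomposed further into (name, phones) and (name, beersLiked), both of which are
in 4NF.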