Normalization
Esha Tripathi
• Database normalization is the process of organizing the attributes of
a database to reduce or eliminate data redundancy (the same data
stored in more than one place).
• Problems caused by data redundancy:
1. Redundancy unnecessarily increases the size of the database,
because the same data is repeated in many places.
2. Inconsistencies can arise during insert, delete, and update
operations.
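Two types of duplication occur:
• Row-level
• Column-level

Row-level duplication (the row 1, Raj, 20 appears twice):
SID  SNAME  AGE
1    Raj    20
2    Arun   18
1    Raj    20

Column-level duplication (the values C1, DBMS, F1, JOHN, 30000 repeat across rows):
SID  SNAME  CID  CNAME  FID  FNAME  SALARY
1    Ram    C1   DBMS   F1   JOHN   30000
2    Ravi   C2   JAVA   F2   BOB    40000
3    Raj    C1   DBMS   F1   JOHN   30000
4    Amit   C1   DBMS   F1   JOHN   30000
….   …..    …..  …..    ….   ….     …..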
• Row-level duplication is removed by defining a primary key on the table.
• Column-level duplication gives rise to anomalies:
- Insertion anomaly
- Deletion anomaly
- Update anomaly
An anomaly is a problem that occurs only in particular cases.
• Insertion anomalies − A fact cannot be inserted on its own because
other, unrelated attributes of the same row are not yet known; the
record we want to insert into does not exist yet.
• Deletion anomalies − We try to delete a record, but parts of it are
left undeleted because, unknown to us, the same data is also stored
somewhere else. Conversely, deleting a row can also remove unrelated
facts that were stored only in that row.
• Update anomalies − If data items are scattered and not properly linked
to each other, updates can go wrong. For example, when we update a data
item whose copies are scattered over several places, some copies are
updated while others keep their old values, which leaves the database in
an inconsistent state.
Normalization is a method to remove all these anomalies and bring the
database to a consistent state.
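The update anomaly can be reproduced with a few rows in plain Python. This is only a minimal sketch: the rows mimic the unnormalized student table, and the helper name salaries_for and the new salary 35000 are assumptions made for illustration.

```python
# Illustrative rows shaped like the unnormalized student table.
# The same faculty facts (F1, JOHN, 30000) are copied into every row.
rows = [
    {"sid": 1, "sname": "Ram",  "fid": "F1", "fname": "JOHN", "salary": 30000},
    {"sid": 3, "sname": "Raj",  "fid": "F1", "fname": "JOHN", "salary": 30000},
    {"sid": 4, "sname": "Amit", "fid": "F1", "fname": "JOHN", "salary": 30000},
]

def salaries_for(rows, fid):
    """Return the set of distinct salaries recorded for one faculty id."""
    return {r["salary"] for r in rows if r["fid"] == fid}

# A careless update that touches only the first matching copy.
for r in rows:
    if r["fid"] == "F1":
        r["salary"] = 35000  # hypothetical new salary
        break

# More than one distinct salary for the same faculty member means
# the copies now disagree with each other.
inconsistent = len(salaries_for(rows, "F1")) > 1
print(inconsistent)  # True
```

Because the salary is stored once per student row, every copy has to be found and changed together; missing one is enough to leave the data inconsistent.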
Insert Anomaly
• INSERT INTO Student
VALUES ('C3', 'DS');    -- error: not enough values
This query fails because an INSERT without a column list must supply a value for
every attribute of the table.
Suppose a university starts a new course C3 named DS. We cannot
insert the full row, because students will register for this course only later.
SID  SNAME  CID  CNAME  FID  FNAME  SALARY
1    Ram    C1   DBMS   F1   JOHN   30000
2    Ravi   C2   JAVA   F2   BOB    40000
3    Raj    C1   DBMS   F1   JOHN   30000
4    Amit   C1   DBMS   F1   JOHN   30000
….   …..    …..  …..    ….   ….     …..
            C3   DS
Deletion Anomaly
• DELETE FROM Student
WHERE SID = 2;
After this query executes, the row for the student with SID = 2 is deleted. If we
then want to fetch the data for faculty F2, we cannot: F2's data was stored only in
that student's row, so it was deleted along with the student and cannot be recovered.
• These anomalies are resolved when we divide the whole table into
three tables.
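One possible three-table split can be sketched with Python's built-in sqlite3 module. The slides do not say which three tables are meant, so the table names (student, course, faculty), the choice to link each course to a faculty member, and the new salary 35000 are all assumptions for illustration.

```python
import sqlite3

# Assumed decomposition: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE faculty (fid TEXT PRIMARY KEY, fname TEXT, salary INTEGER);
CREATE TABLE course  (cid TEXT PRIMARY KEY, cname TEXT,
                      fid TEXT REFERENCES faculty(fid));
CREATE TABLE student (sid INTEGER PRIMARY KEY, sname TEXT,
                      cid TEXT REFERENCES course(cid));
INSERT INTO faculty VALUES ('F1', 'JOHN', 30000), ('F2', 'BOB', 40000);
INSERT INTO course  VALUES ('C1', 'DBMS', 'F1'), ('C2', 'JAVA', 'F2');
INSERT INTO student VALUES (1, 'Ram', 'C1'), (2, 'Ravi', 'C2'),
                           (3, 'Raj', 'C1'), (4, 'Amit', 'C1');
""")

# Insert anomaly gone: a course can exist before any student registers.
# fid stays NULL because the slides do not say who teaches C3.
conn.execute("INSERT INTO course (cid, cname) VALUES ('C3', 'DS')")

# Deletion anomaly gone: removing student 2 leaves faculty F2 intact.
conn.execute("DELETE FROM student WHERE sid = 2")
f2 = conn.execute(
    "SELECT fname, salary FROM faculty WHERE fid = 'F2'"
).fetchone()

# Update anomaly gone: JOHN's salary is stored once, so one UPDATE suffices.
conn.execute("UPDATE faculty SET salary = 35000 WHERE fid = 'F1'")
joined = conn.execute("""
    SELECT DISTINCT f.salary
    FROM student s
    JOIN course c ON c.cid = s.cid
    JOIN faculty f ON f.fid = c.fid
    WHERE f.fid = 'F1'
""").fetchall()

print(f2)      # ('BOB', 40000)
print(joined)  # [(35000,)]
```

Each fact now lives in exactly one row, which is why a single INSERT, DELETE, or UPDATE is enough in each case.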
Functional Dependency
• In relational database management, a functional dependency is
a constraint that specifies the relationship between two sets of
attributes, where one set determines the value of the
other. It is denoted X → Y, where the attribute set on the
left side of the arrow, X, is called the determinant, and Y is called
the dependent.
roll_no name dept_name dept_building
42 abc CO A4
43 pqr IT A3
44 xyz CO A4
45 xyz IT A3
46 mno EC B2
47 jkl ME B2
From the above table we can conclude some valid functional dependencies:
1. roll_no → {name, dept_name, dept_building}: roll_no determines the values of name, dept_name, and
dept_building, so this is a valid functional dependency.
2. roll_no → dept_name: since roll_no determines the whole set {name, dept_name, dept_building}, it also
determines the subset dept_name.
3. dept_name → dept_building: each dept_name is associated with exactly one dept_building. Two departments may
share a building (EC and ME are both in B2); that does not violate the dependency.
More valid functional dependencies: roll_no → name, {roll_no, name} → {dept_name, dept_building}, etc.
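Here are some invalid functional dependencies:
1. name → dept_name: students with the same name can have different dept_name values (xyz appears in both
CO and IT), so this is not a valid functional dependency.
2. dept_building → dept_name: building B2 houses both EC and ME, so dept_building does not determine dept_name.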
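The valid and invalid cases above can be checked mechanically against the sample rows. The following is a minimal sketch; the function name holds and the list-of-dicts representation are assumptions chosen for illustration.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are tuples of column names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # The same determinant value must always map to the same dependent value.
        if seen.setdefault(key, value) != value:
            return False
    return True

COLUMNS = ("roll_no", "name", "dept_name", "dept_building")
DATA = [
    (42, "abc", "CO", "A4"),
    (43, "pqr", "IT", "A3"),
    (44, "xyz", "CO", "A4"),
    (45, "xyz", "IT", "A3"),
    (46, "mno", "EC", "B2"),
    (47, "jkl", "ME", "B2"),
]
rows = [dict(zip(COLUMNS, r)) for r in DATA]

print(holds(rows, ("roll_no",), ("name", "dept_name", "dept_building")))  # True
print(holds(rows, ("dept_name",), ("dept_building",)))                    # True
print(holds(rows, ("name",), ("dept_name",)))                             # False
print(holds(rows, ("dept_building",), ("dept_name",)))                    # False
```

A check like this can only show that a dependency is violated or not violated in the rows at hand. Whether it is a real rule of the data is a design decision that sample rows alone cannot prove.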
Transitivity: if X → Y and Y → Z are both valid dependencies, then X → Z is also valid by the
transitivity rule.
For example, roll_no → dept_name and dept_name → dept_building, so roll_no → dept_building is
also valid.
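Applying the transitivity rule repeatedly amounts to computing an attribute closure: every attribute that a starting set determines. This is a minimal sketch; the function name closure and the representation of dependencies as pairs of sets are assumptions for illustration.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs.

    fds is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know all of lhs, we also know all of rhs.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

fds = [
    ({"roll_no"}, {"dept_name"}),
    ({"dept_name"}, {"dept_building"}),
]

# roll_no reaches dept_building through dept_name.
print("dept_building" in closure({"roll_no"}, fds))  # True
# dept_name does not determine roll_no.
print("roll_no" in closure({"dept_name"}, fds))      # False
```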
1. Trivial Functional Dependency
• In a trivial functional dependency, the dependent is always a
subset of the determinant, i.e. if X → Y and Y is a subset of
X, then X → Y is called a trivial functional dependency.
• Example:
roll_no name age
42 abc 17
43 pqr 18
44 xyz 18
Here, {roll_no, name} → name is a trivial functional dependency, since
the dependent name is a subset of the determinant set {roll_no,
name}. Similarly, roll_no → roll_no is also a trivial
functional dependency.
2. Non-trivial Functional Dependency
• In a non-trivial functional dependency, the dependent is
not a subset of the determinant, i.e. if X → Y and Y is not a
subset of X, then X → Y is called a non-trivial functional dependency.
• Example:
roll_no name age
42 abc 17
43 pqr 18
44 xyz 18
Here, roll_no → name is a non-trivial functional dependency, since the dependent name is not a
subset of the determinant roll_no. Similarly, {roll_no, name} → age is also a non-trivial functional
dependency, since age is not a subset of {roll_no, name}.
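The trivial versus non-trivial distinction is a plain subset test on the two attribute sets, and it does not depend on the data in the table. A minimal sketch, with is_trivial as an assumed helper name:

```python
def is_trivial(lhs, rhs):
    """X -> Y is trivial exactly when Y is a subset of X."""
    return set(rhs) <= set(lhs)

print(is_trivial({"roll_no", "name"}, {"name"}))  # True
print(is_trivial({"roll_no"}, {"roll_no"}))       # True
print(is_trivial({"roll_no"}, {"name"}))          # False
print(is_trivial({"roll_no", "name"}, {"age"}))   # False
```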
3. Multivalued Functional Dependency
• In a multivalued functional dependency, the attributes of the
dependent set are not dependent on each other, i.e. if a → {b,
c} and there exists no functional dependency between b and
c, then it is called a multivalued functional dependency.
• Example:
roll_no name age
42 abc 17
43 pqr 18
44 xyz 18
45 abc 19
Here, roll_no → {name, age}, and name and age do not depend on each other: the name abc
occurs with ages 17 and 19, and the age 18 occurs with names pqr and xyz.
4. Transitive Functional Dependency
• Example:
enrol_no name dept building_no
43 pqr EC 2
44 xyz IT 1
45 abc EC 2
Here, enrol_no → dept and dept → building_no. Hence, by the axiom of
transitivity, enrol_no → building_no is a valid functional dependency. This is an indirect
functional dependency, hence it is called a transitive functional dependency.
5. Fully Functional Dependency
• In a full functional dependency, an attribute or a set of attributes
uniquely determines another attribute or set of attributes, and no
proper subset of the determinant does so on its own. If a
relation R has attributes X, Y, Z with the dependencies X → Y
and X → Z, those dependencies are fully functional, because the
single attribute X has no smaller subset that could determine Y or Z.
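Whether a dependency is full can be tested by checking that no proper subset of the determinant already determines the dependent. This is a minimal sketch on the roll_no, name, age sample rows used earlier; the helper names holds and is_full are assumptions, and the check only reflects the rows given.

```python
from itertools import combinations

def holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by any pair of rows."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

def is_full(rows, lhs, rhs):
    """Return True if lhs -> rhs holds and no proper subset of lhs determines rhs."""
    if not holds(rows, lhs, rhs):
        return False
    # Try every non-empty proper subset of the determinant.
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if holds(rows, subset, rhs):
                return False
    return True

rows = [
    {"roll_no": 42, "name": "abc", "age": 17},
    {"roll_no": 43, "name": "pqr", "age": 18},
    {"roll_no": 44, "name": "xyz", "age": 18},
]

# A single-attribute determinant has no proper subset to test.
print(is_full(rows, ("roll_no",), ("age",)))         # True
# roll_no alone already determines age, so this dependency is only partial.
print(is_full(rows, ("roll_no", "name"), ("age",)))  # False
```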
Advantages of Functional Dependencies
1. Data Normalization
2. Query Optimization
3. Consistency of Data
4. Data Quality Improvement
Levels of Normalization
• Levels of normalization are based on the amount of
redundancy in the database.
• The levels of normalization are:
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• Boyce-Codd Normal Form (BCNF)
• Fourth Normal Form (4NF)
• Fifth Normal Form (5NF)
• Domain Key Normal Form (DKNF)
[Figure: redundancy, number of tables, and complexity across the normal forms]