Normalization and Denormalization
A functional dependency is a relationship between two attributes such that the
value of one attribute determines the value of the other. The attribute that is
used for finding the values of other attributes is called the determinant; in the
examples below, the determinant is the primary key attribute.
We have a table with student details such as roll number, name, and city.
roll_no name city
1 Yash Delhi
2 Kartik Mumbai
3 Aditya Delhi
4 Kartik Pune
Here roll_no is the only unique attribute, so it is the primary key of the table.
The other attributes, name and city, depend on roll_no: given a roll_no, we can
find the student’s name and city. But we cannot find a student’s roll_no from
the name or city, because that would be ambiguous.
For example, if we take the name Kartik, there are two records with that name
(roll_no 2 and 4), which is ambiguous. Likewise, if we take the city Delhi, there
are two records (roll_no 1 and 3).
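The check described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the helper name determines is an assumption, and it only tests whether the sample rows violate a dependency. It cannot prove that the dependency holds for all possible data.

```python
def determines(rows, lhs, rhs):
    """Return True if no two rows agree on the lhs columns but differ on the rhs columns."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        # setdefault stores the first value seen for this key and returns it afterwards
        if seen.setdefault(key, value) != value:
            return False
    return True


# The student table from the example above
students = [
    {"roll_no": 1, "name": "Yash", "city": "Delhi"},
    {"roll_no": 2, "name": "Kartik", "city": "Mumbai"},
    {"roll_no": 3, "name": "Aditya", "city": "Delhi"},
    {"roll_no": 4, "name": "Kartik", "city": "Pune"},
]

print(determines(students, ["roll_no"], ["name", "city"]))  # True
print(determines(students, ["name"], ["roll_no"]))          # False: two students named Kartik
print(determines(students, ["city"], ["roll_no"]))          # False: two students from Delhi
```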
Types of Dependencies
Partial Dependency
Full Dependency
Transitive Dependency
Partial Dependency
If the value of a non-primary attribute can be determined using only part of the
primary key, the dependency is called a partial dependency. This can occur only
when the primary key is formed using more than one attribute (a composite key).
In the example below, the primary key is formed using roll_no + sub_id,
and one of the non-primary attributes depends on only part of the primary key.
Example
Let’s take an example: we have a table with columns for student roll number,
subject ID, subject name, and marks.
roll_no sub_id sub_name marks
1 121 Science 80
1 131 Math 65
2 131 Math 95
2 141 English 75
Here the primary key is roll_no + sub_id, because multiple roll_no values can
have the same sub_id and the same roll_no can have multiple sub_id values. In
the given example, roll_no 1 has two sub_id values, 121 and 131, whereas
sub_id 131 has two roll_no values, 1 and 2.
But there is another column, sub_name, and the value of sub_name can be
determined from sub_id alone. For example, sub_id = 131 always has
sub_name = ‘Math’; here only sub_id is required, not the whole primary key.
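One way to remove this partial dependency is to move sub_name into its own table keyed by sub_id. The sketch below shows that split on the sample rows. The names subjects and marks_by_key are hypothetical and chosen only for illustration.

```python
# Rows from the example table: (roll_no, sub_id, sub_name, marks)
marks_table = [
    (1, 121, "Science", 80),
    (1, 131, "Math", 65),
    (2, 131, "Math", 95),
    (2, 141, "English", 75),
]

# sub_name depends only on sub_id, so it moves to a table keyed by sub_id.
subjects = {}
for roll_no, sub_id, sub_name, score in marks_table:
    # A different name for an already seen sub_id would break the dependency.
    assert subjects.setdefault(sub_id, sub_name) == sub_name

# marks needs the whole key (roll_no, sub_id), so it stays with the composite key.
marks_by_key = {(roll_no, sub_id): score for roll_no, sub_id, _, score in marks_table}

print(subjects)               # {121: 'Science', 131: 'Math', 141: 'English'}
print(marks_by_key[(2, 131)])  # 95
```

After the split, each subject name is stored once instead of once per student.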
Full Dependency
If all attributes of the primary key are required to identify the value of a
non-primary attribute, that attribute is fully dependent on the primary key.
When all non-primary attributes depend on the whole primary key and cannot be
determined using only part of it, this is called Full Dependency. It is also
known as Full Functional Dependency.
Example
Let’s take an example: we have a table with columns for student roll number,
subject ID, and marks.
Table
roll_no sub_id marks
1 121 80
1 131 65
2 131 95
2 141 75
Here the primary key is roll_no + sub_id. If we want the marks of any student, we
require both roll_no and sub_id. We cannot obtain marks based on only one of them.
If we want to know the marks for sub_id = 131, there are two records, which is
ambiguous. If we take roll_no = 1, there are again two records with the same roll
number, which is also ambiguous. This ambiguity is resolved by using all attributes
of the primary key, i.e. roll_no + sub_id. So we can say that marks is fully
dependent on the primary key.
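The reasoning above amounts to checking that the whole key determines marks while no proper subset of the key does. A minimal Python sketch of that check follows. The function names depends_on and fully_dependent are assumptions made for this illustration, and the check only reflects the sample rows.

```python
from itertools import combinations


def depends_on(rows, attr, key_part):
    """True if no two rows share the key_part values but differ on attr."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in key_part)
        if seen.setdefault(key, row[attr]) != row[attr]:
            return False
    return True


def fully_dependent(rows, key, attr):
    """True if attr depends on the whole key and on no proper subset of it."""
    if not depends_on(rows, attr, key):
        return False
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            if depends_on(rows, attr, subset):
                return False  # a part of the key is enough: partial dependency
    return True


rows = [
    {"roll_no": 1, "sub_id": 121, "marks": 80},
    {"roll_no": 1, "sub_id": 131, "marks": 65},
    {"roll_no": 2, "sub_id": 131, "marks": 95},
    {"roll_no": 2, "sub_id": 141, "marks": 75},
]

print(fully_dependent(rows, ("roll_no", "sub_id"), "marks"))  # True
```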
Transitive Dependency
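When a non-primary attribute does not need the primary key directly and can get
its value from another non-primary attribute, the dependency is called a
transitive dependency.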
Example
Let’s take an example: we have a table with columns that include student roll
number, city, and zip code.
Here the primary key is roll_no, but we can identify the city using the zip code
attribute. For example, roll_no = 1 has city = Pune, and city = Pune has its zip
code present in the database.
To avoid a transitive dependency, a new table should be created from the
non-prime attributes that are related to each other. The new table should have its
own primary key, and the two tables should be linked via a foreign key. In other
words, the related non-prime attributes are removed from the original table and
kept in a new table with its own relation.
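The split can be sketched with Python's built-in sqlite3 module. The table names students and zip_codes and the zip code value 411001 are illustrative assumptions, since the original example table is not shown. In this sketch the original table holds the foreign key to the new table, which is one common way to link them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    -- New table: the related non-prime attributes, with their own primary key
    CREATE TABLE zip_codes (
        zip_code TEXT PRIMARY KEY,
        city     TEXT NOT NULL
    );
    -- Original table: keeps roll_no and refers to zip_codes through a foreign key
    CREATE TABLE students (
        roll_no  INTEGER PRIMARY KEY,
        zip_code TEXT NOT NULL REFERENCES zip_codes(zip_code)
    );
""")

# Hypothetical sample data
conn.execute("INSERT INTO zip_codes VALUES ('411001', 'Pune')")
conn.executemany("INSERT INTO students VALUES (?, ?)", [(1, "411001"), (2, "411001")])

# The city is stored once and reached through the zip code
rows = conn.execute("""
    SELECT s.roll_no, z.city
    FROM students AS s
    JOIN zip_codes AS z ON z.zip_code = s.zip_code
    ORDER BY s.roll_no
""").fetchall()
print(rows)  # [(1, 'Pune'), (2, 'Pune')]
```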
What Is Normalization?
Normalization is the process of organizing (decomposing) the data in a relational
database in accordance with a series of normal forms in order to reduce data
redundancy, improve data integrity, and remove insert, update, and delete
anomalies.
By normalizing a database, you arrange the data into tables and columns. You ensure
that each table contains only related data. If data is not directly related, you create a
new table for that data. Normalization is an important part of relational database design
for many reasons, but mainly because it lets the database take up as little disk space
as possible, which can improve the speed, accuracy, and efficiency of the database.
1NF
2NF
3NF
BCNF
4NF
5NF
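To make the update anomaly mentioned in the definition of normalization concrete, the sketch below contrasts a redundant layout with a normalized one. The data and the variable names are hypothetical and only meant to illustrate the idea.

```python
# Redundant layout: sub_name is repeated in every row for the same sub_id.
rows = [
    {"roll_no": 1, "sub_id": 131, "sub_name": "Math"},
    {"roll_no": 2, "sub_id": 131, "sub_name": "Math"},
]

# Update anomaly: renaming the subject in only one row leaves the data inconsistent.
rows[0]["sub_name"] = "Mathematics"
names_for_131 = {row["sub_name"] for row in rows if row["sub_id"] == 131}
print(len(names_for_131))  # 2, so sub_id 131 now has two different names

# Normalized layout: the name is stored once, so one update is enough.
subjects = {131: "Math"}
enrolments = [(1, 131), (2, 131)]  # (roll_no, sub_id)
subjects[131] = "Mathematics"
names_after_update = {subjects[sub_id] for _, sub_id in enrolments}
print(names_after_update)  # {'Mathematics'}
```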
What Is Denormalization?
Denormalization is the process of combining data from multiple tables into a
single table so that data retrieval is faster. It is a strategy that database
administrators use to increase the read performance of a database.
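As a rough illustration of combining tables, the sketch below stores the result of a join as a single table using Python's sqlite3 module. The table names subjects, marks, and marks_report are assumptions made for this example. The rows reuse the subject and marks data from the dependency examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized tables
    CREATE TABLE subjects (sub_id INTEGER PRIMARY KEY, sub_name TEXT NOT NULL);
    CREATE TABLE marks (
        roll_no INTEGER,
        sub_id  INTEGER,
        marks   INTEGER,
        PRIMARY KEY (roll_no, sub_id)
    );
    INSERT INTO subjects VALUES (121, 'Science'), (131, 'Math'), (141, 'English');
    INSERT INTO marks VALUES (1, 121, 80), (1, 131, 65), (2, 131, 95), (2, 141, 75);

    -- Denormalized copy: the join is done once and its result is stored
    CREATE TABLE marks_report AS
        SELECT m.roll_no, m.sub_id, s.sub_name, m.marks
        FROM marks AS m
        JOIN subjects AS s ON s.sub_id = m.sub_id;
""")

# Reads no longer need a join
rows = conn.execute(
    "SELECT sub_name, marks FROM marks_report WHERE roll_no = ? ORDER BY sub_id",
    (1,),
).fetchall()
print(rows)  # [('Science', 80), ('Math', 65)]
```

The trade-off is that marks_report repeats each subject name and has to be refreshed whenever subjects or marks change.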
Advantages of Normalization
Users can extend the database without necessarily impacting the existing data.
Minimizes null values
Helps to reduce or avoid modification problems
Searching, sorting and creating indexes can be faster since tables are
narrower and more rows fit on a data page.
Simplifies queries
Makes database smaller by eliminating redundant data.
It removes anomalies that can cause errors in the system.
Does not waste storage space
It results in a more compact database (due to less data redundancy)
It results in a database that is simpler and easier to understand.
Disadvantages of Normalization
Normalization is a very difficult task because it requires detailed analysis and
design of the database.
Normalization makes querying more tedious, because there are more tables to join.
Normalization can slow the performance of the database system, because tables
contain codes rather than real data, so the real values must be looked up through
joins.
It makes queries more difficult to write, because the SQL is often constructed
dynamically, usually by desktop-friendly query tools, and it is hard to model
the database without knowing what the customer desires.
A poorly normalized database may perform badly and store data inefficiently.
Normalizing a database is sometimes complex because analysts have to
understand the purpose of the database such as whether it should be
optimized for writing data, reading data or both.
Advantages Of Denormalization
It reduces the number of foreign keys and indexes. This helps to save on data
manipulation time and memory as well.
It minimizes the number of necessary join queries
In some cases, it reduces the number of tables in the database.
Improves performance of the database by increasing speed
Disadvantages Of Denormalization
Denormalization usually speeds retrieval but can slow updates.
Denormalization is always application-specific and therefore needs to be
re-evaluated if the application changes.
Denormalization can increase the size of tables.
Denormalization can make update and insert code difficult to write.
Data redundancy necessitates more storage.
Denormalization does not by itself maintain data integrity, because redundant
copies of the same data can become inconsistent.
Wastes storage space.