Week 2
Meaning
Normalization is the process of
organizing data within a relational
database to reduce redundancy and
the data anomalies it causes.
In simpler terms, it involves
breaking down a large, complex
table into smaller and simpler tables
while maintaining data relationships.
Normalization is commonly used
when dealing with large datasets.
If a dataset is maintained in the form of
just a single table, it leads to Data
Redundancy, which means a single
value of data is stored multiple times.
This leads to many issues, such as an
increase in the database size, slower
data retrieval, and data inconsistency.
Thus, to overcome this, Normalization in
DBMS is used, in which a large table is
reduced into smaller tables until each
table holds a single relation.
Normalization in DBMS is a technique using
which you can organize the data in the
database tables so that:
◦ There is less repetition of data,
◦ A large set of data is structured into a bunch of
smaller tables, and the tables have a proper
relationship between them.
DBMS Normalization is a systematic approach
to decomposing (breaking down) tables to
eliminate data redundancy (repetition) and
undesirable characteristics like the Insertion
anomaly, the Update anomaly, and the
Delete anomaly.
It is a multi-step process that puts data
into tabular form, removes duplicate data,
and sets up the relationships between tables.
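
The decomposition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the employee and department rows, the column names, and the helper names `decompose` and `rejoin` are all hypothetical and do not come from the course material. It shows one wide table being split into two smaller ones that still reference each other, and that the original rows can be rebuilt from them.

```python
# Hypothetical flat table: every row repeats the department's details.
flat_rows = [
    {"emp_id": 1, "emp_name": "Asha", "dept": "Sales", "dept_phone": "1111"},
    {"emp_id": 2, "emp_name": "Ben", "dept": "Sales", "dept_phone": "1111"},
    {"emp_id": 3, "emp_name": "Chen", "dept": "Support", "dept_phone": "2222"},
]


def decompose(rows):
    """Split the flat table into an employee table and a department table."""
    employees = [
        {"emp_id": r["emp_id"], "emp_name": r["emp_name"], "dept": r["dept"]}
        for r in rows
    ]
    # Each department's details are stored once, keyed by department name.
    departments = {r["dept"]: {"dept_phone": r["dept_phone"]} for r in rows}
    return employees, departments


def rejoin(employees, departments):
    """Rebuild the flat rows by following each employee's dept reference."""
    return [{**e, **departments[e["dept"]]} for e in employees]


employees, departments = decompose(flat_rows)
rebuilt = rejoin(employees, departments)
```

The department phone number now lives in exactly one place, so changing it means changing one entry, while `rejoin` shows that no information was lost by the split.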
Why do we need Normalization in
DBMS?
Normalization is required for:
◦ Eliminating redundant (repeated) data and thereby
protecting data integrity, because repeated data
increases the chances of inconsistent data.
◦ Keeping data consistent by storing each piece of
data in one table and referencing it everywhere
else.
◦ Optimizing storage, although this matters less
these days because database storage is cheap.
◦ Breaking down large tables into smaller tables with
relationships, which makes the database structure
more scalable and adaptable.
◦ Ensuring data dependencies make sense, i.e. data is
logically stored.
Problems without Normalization
in DBMS
If a table is not properly
normalized and has data
redundancy (repetition), it
will not only use extra
storage space but will also
make it difficult to handle
and update the data in the
database without losing data.
Insertion, Update, and Deletion
Anomalies are very frequent if
the database is not normalized.
To understand these anomalies let us
take an example of a Student table.
rollno  name  branch  hod    office_tel
401     Akon  CSE     Mr. X  53337
402     Bkon  CSE     Mr. X  53337
403     Ckon  CSE     Mr. X  53337
404     Dkon  CSE     Mr. X  53337
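
To make the anomalies concrete, here is a minimal sketch that loads the Student table into an in-memory SQLite database using Python's standard `sqlite3` module. The replacement HOD name "Mr. Y" and the column types are assumptions made purely for illustration. The sketch triggers an update anomaly and a deletion anomaly; an insertion anomaly would show up the same way, since a new branch cannot be recorded in this table until at least one student joins it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (rollno INTEGER PRIMARY KEY, name TEXT, "
    "branch TEXT, hod TEXT, office_tel TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?, ?)",
    [
        (401, "Akon", "CSE", "Mr. X", "53337"),
        (402, "Bkon", "CSE", "Mr. X", "53337"),
        (403, "Ckon", "CSE", "Mr. X", "53337"),
        (404, "Dkon", "CSE", "Mr. X", "53337"),
    ],
)

# Update anomaly: the HOD changes (hypothetical "Mr. Y"), but only one
# of the four repeated copies is updated.
conn.execute("UPDATE student SET hod = 'Mr. Y' WHERE rollno = 401")
hods_for_cse = conn.execute(
    "SELECT DISTINCT hod FROM student WHERE branch = 'CSE' ORDER BY hod"
).fetchall()
# hods_for_cse now holds two different HODs for the same branch.

# Deletion anomaly: removing every student also removes the only record
# of the branch, its HOD, and its office telephone number.
conn.execute("DELETE FROM student")
branches_left = conn.execute(
    "SELECT COUNT(DISTINCT branch) FROM student"
).fetchone()[0]
conn.close()
```

Moving `branch`, `hod`, and `office_tel` into a separate branch table, referenced from the student rows, would avoid both problems, because each branch's details would then be stored once.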
Our table already satisfies 3 of the 4 rules: all our
column names are unique, we have stored data in the order we
wanted to, and we have not intermixed different types of data in
columns.
But out of the 3 different students in our table, 2 have opted for
more than 1 subject, and we have stored the subject names in a
single column. As per the First Normal Form, each column must
contain atomic values.
Once each subject is stored in its own row, a few values are
repeated, but the values for the subject column are now atomic
for each record/row.

roll_no  name  subject
101      Akon  OS
101      Akon  CN
103      Ckon  Java
102      Bkon  C
102      Bkon  C++

Using the First Normal Form, data redundancy increases, as
there will be many columns with the same data in multiple rows,
but each row as a whole will be unique.
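
The step from a multi-valued subject column to atomic values can be sketched as follows. The starting rows are an assumption: they are reconstructed from the result table above by guessing that the subjects were originally stored as comma-separated strings, and the function name `to_first_normal_form` is hypothetical.

```python
# Assumed starting table: several subjects packed into one column value.
unnormalized = [
    {"roll_no": 101, "name": "Akon", "subject": "OS, CN"},
    {"roll_no": 103, "name": "Ckon", "subject": "Java"},
    {"roll_no": 102, "name": "Bkon", "subject": "C, C++"},
]


def to_first_normal_form(rows):
    """Emit one row per (student, subject) so each subject value is atomic."""
    result = []
    for row in rows:
        for subject in row["subject"].split(","):
            result.append(
                {
                    "roll_no": row["roll_no"],
                    "name": row["name"],
                    "subject": subject.strip(),
                }
            )
    return result


atomic_rows = to_first_normal_form(unnormalized)
```

The three input rows become five output rows. The roll number and name are repeated for students with more than one subject, which is the increase in redundancy noted above.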
Second Normal Form (2NF)
For a table to be in the Second Normal Form,
It should be in the First Normal form.
And, it should not have Partial Dependency.
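
A partial dependency exists when a non-key column depends on only part of a composite primary key. The sketch below looks for such dependencies in sample data. Everything in it is an assumption for illustration: the score table, its composite key (`student_id`, `subject_id`), and the helper names `determines` and `partial_dependencies` are hypothetical. Note also that a dependency that holds in a handful of sample rows is only a hint; whether it is a real rule has to be decided from the meaning of the data.

```python
from itertools import combinations


def determines(rows, lhs, rhs):
    """Return True if, in these rows, equal lhs values always give equal rhs."""
    seen = {}
    for row in rows:
        left = tuple(row[c] for c in lhs)
        # setdefault stores the first rhs seen for this lhs and returns it.
        if seen.setdefault(left, row[rhs]) != row[rhs]:
            return False
    return True


def partial_dependencies(rows, key, non_key_columns):
    """List (key_subset, column) pairs where part of the key determines column."""
    found = []
    for size in range(1, len(key)):  # proper subsets of the key only
        for subset in combinations(key, size):
            for column in non_key_columns:
                if determines(rows, subset, column):
                    found.append((subset, column))
    return found


# Hypothetical table with composite key (student_id, subject_id).
score_rows = [
    {"student_id": 1, "subject_id": 10, "marks": 70, "teacher": "Teacher A"},
    {"student_id": 1, "subject_id": 20, "marks": 75, "teacher": "Teacher B"},
    {"student_id": 2, "subject_id": 10, "marks": 80, "teacher": "Teacher A"},
    {"student_id": 2, "subject_id": 20, "marks": 65, "teacher": "Teacher B"},
]

violations = partial_dependencies(
    score_rows, ("student_id", "subject_id"), ["marks", "teacher"]
)
```

Here `teacher` is determined by `subject_id` alone, so this hypothetical table would not be in 2NF. Moving `teacher` into a separate table keyed by `subject_id` would remove the partial dependency, while `marks` stays because it depends on the whole key.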
s_id  course
1     Science
1     Maths
2     C#
2     Php

And, Hobbies Table,

Now this relation satisfies the fourth normal form.