Normalization 1

Normalization means

Uploaded by

Alijan Jan
Copyright
© All Rights Reserved

Normalization

Four types of anomalies are of concern: redundancy, insertion, deletion and update. Normalization is
not compulsory, but it is strongly recommended, because a normalized design makes the database
much easier to maintain. The process of normalization should be applied to each table of the database.
It is performed after the logical database design, and it is also followed informally during conceptual
database design.

Normalization Process
There are different forms or levels of normalization, called first, second and so on. Each normal form
has certain requirements or conditions which must be fulfilled. If a table or relation fulfills the
conditions of a particular form, it is said to be in that normal form. The process is applied to each
relation of the database. The lowest normal form that all the tables satisfy is called the normal form of
the entire database. The main objective of normalization is to place the database in the highest
possible normal form.

Functional Dependency
Normalization is based on the concept of functional dependency. A functional dependency is a type of
relationship between attributes.

Definition of Functional Dependency: If A and B are attributes or sets of attributes of relation R, we say
that B is functionally dependent on A if each value of A in R has associated with it exactly one value of B
in R.

We write this as A → B, read as “A functionally determines B” or “A determines B”. This does not mean
that A causes B or that the value of B can be calculated from the value of A by a formula, although
sometimes that is the case. It simply means that if we know the value of A and we examine the table of
relation R, we will find only one value of B in all the rows that have the given value of A at any one time.
Thus when two rows have the same A value, they must also have the same B value. However, for a
given B value, there may be several different A values. When a functional dependency exists, the
attribute or set of attributes on the left side of the arrow is called the determinant, and the attribute or
set of attributes on the right side is called the dependent. For example, a relation R with attributes
(a,b,c,d,e) might have the functional dependencies:

a → b,c,d

d → e

As a further example, consider a relation of student with the following attributes. We will establish
the functional dependencies among its attributes:

STD (stId,stName,stAdr,prName,credits)

stId → stName,stAdr,prName,credits

prName → credits

In this example, if we know the stId we can tell the complete information about that student.
Similarly, if we know the prName, we can tell the credit hours for that particular program.
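
The definition can be tested against stored rows. The following is a minimal Python sketch, not part of the original notes: the function name holds and the sample rows (names, addresses, credit values) are hypothetical and chosen only for illustration. It checks that no two rows agree on the determinant while differing on the dependent attributes.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        a = tuple(row[c] for c in lhs)   # value of the determinant
        b = tuple(row[c] for c in rhs)   # value of the dependent attributes
        # Remember the first b seen for each a; any later mismatch breaks the FD.
        if seen.setdefault(a, b) != b:
            return False
    return True


# Hypothetical sample rows for STD(stId, stName, stAdr, prName, credits)
std_rows = [
    {"stId": "S1", "stName": "Ali", "stAdr": "Lahore", "prName": "BCS", "credits": 130},
    {"stId": "S2", "stName": "Sara", "stAdr": "Karachi", "prName": "BCS", "credits": 130},
    {"stId": "S3", "stName": "Ali", "stAdr": "Multan", "prName": "MCS", "credits": 64},
]

id_determines_all = holds(std_rows, ["stId"], ["stName", "stAdr", "prName", "credits"])
program_determines_credits = holds(std_rows, ["prName"], ["credits"])
name_determines_address = holds(std_rows, ["stName"], ["stAdr"])
```

Sample data can only refute a functional dependency, never prove it. A dependency is a rule about every legal state of the relation, so it has to come from the meaning of the data.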

Functional Dependencies and Keys: We can determine the keys of a relation from its functional
dependencies. A determinant that functionally determines all attributes of the table is a super key.
A super key is an attribute or a set of attributes that identifies an entity uniquely. In a table, a super
key is any column or set of columns whose values can be used to distinguish one row from another.
A minimal super key is a candidate key. So if a determinant determines all attributes of the relation,
it is definitely a super key, and if no proper subset of this determinant is itself a super key, then it is a
candidate key. In this way the functional dependencies help to identify keys. An example follows:

EMP (eId,eName,eAdr,eDept,prId,prSal)

eId (eName,eAdr,eDept)

eId,prId prSal

In this example, eId uniquely determines the employee name, address and department. Similarly, if
we know both the employee ID and the project ID we can find the project salary. Together, eId and
prId determine every attribute of EMP, so the combination (eId, prId) is the key of the relation. So
FDs help in finding the keys and how the attributes relate to them.
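
The step from functional dependencies to keys can be sketched with the standard attribute-closure computation. The Python below is an illustrative sketch only, and the function names closure, is_super_key and is_candidate_key are assumptions rather than anything from the notes. It uses the EMP relation and the two dependencies given above.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already known, the right side follows.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_super_key(attrs, all_attrs, fds):
    return closure(attrs, fds) == set(all_attrs)


def is_candidate_key(attrs, all_attrs, fds):
    """A candidate key is a super key from which no attribute can be dropped."""
    if not is_super_key(attrs, all_attrs, fds):
        return False
    return not any(
        is_super_key(set(attrs) - {a}, all_attrs, fds) for a in attrs
    )


emp_attrs = ["eId", "eName", "eAdr", "eDept", "prId", "prSal"]
emp_fds = [
    (["eId"], ["eName", "eAdr", "eDept"]),
    (["eId", "prId"], ["prSal"]),
]

eid_closure = closure(["eId"], emp_fds)
composite_is_candidate = is_candidate_key(["eId", "prId"], emp_attrs, emp_fds)
padded_is_super = is_super_key(["eId", "prId", "eName"], emp_attrs, emp_fds)
padded_is_candidate = is_candidate_key(["eId", "prId", "eName"], emp_attrs, emp_fds)
```

Under these dependencies eId alone reaches only the employee's own attributes, while (eId, prId) reaches every attribute and cannot be reduced. Adding eName to it still gives a super key, but no longer a minimal one.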

Normal Forms
Normalization is basically a process of efficiently organizing data in a database. There are two goals of
the normalization process: eliminate redundant data (for example, storing the same data in more than
one table) and ensure data dependencies make sense (only storing related data in a table). Both of
these are worthy goals, as they reduce the amount of space a database consumes and ensure that data
is logically stored. We will now study the first normal form.

First Normal Form: A relation is in first normal form if and only if every attribute is single valued for
each tuple. This means that each attribute in each row, or each cell of the table, contains only one
value. No repeating fields or groups are allowed. An alternative way of describing first normal form is
to say that the domains of the attributes of a relation are atomic, that is, they consist of single units
that cannot be broken down further. There is no multivalued attribute (repeating group) in the
relation, because multiple values create problems in performing operations like select or join. For
example, consider a relation of Student.
This table is in first normal form: for every tuple, each attribute holds a single value.
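
As an illustration of removing a repeating group, the following Python sketch flattens a multivalued attribute into one row per value. The function name to_first_normal_form, the phone attribute and the sample rows are hypothetical and do not come from the original Student table.

```python
def to_first_normal_form(rows, multivalued):
    """Emit one row per value of the multivalued attribute."""
    flat = []
    for row in rows:
        # Note: a row whose list is empty produces no output row in this sketch.
        for value in row[multivalued]:
            new_row = dict(row)            # copy the single-valued attributes
            new_row[multivalued] = value   # replace the list by one atomic value
            flat.append(new_row)
    return flat


# Hypothetical unnormalized rows: "phone" holds a repeating group
students = [
    {"stId": "S1", "stName": "Ali", "phone": ["111", "222"]},
    {"stId": "S2", "stName": "Sara", "phone": ["333"]},
]

flat = to_first_normal_form(students, "phone")
for row in flat:
    print(row)
```

After flattening, stId alone no longer identifies a row, so the key of the flattened relation becomes the combination (stId, phone).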

Second Normal Form:

A relation is in second normal form (2NF) if and only if it is in first normal form and all the non-key
attributes are fully functionally dependent on the key. Clearly, if a relation is in 1NF and the key
consists of a single attribute, the relation is automatically in 2NF. The only time we have to be
concerned about 2NF is when the key is composite. Second normal form addresses the removal of
duplicated data: it removes subsets of data that apply to multiple rows of a table and places them in
separate tables, and it creates relationships between these new tables and their predecessors
through the use of foreign keys.

A relation that is not in 2NF exhibits the update, insertion and deletion anomalies. We will now see
this with an example. Consider the following relation.

In this relation the key is the combination of course ID and student ID. The requirement of 2NF is that
all non-key attributes should be fully dependent on the key; there should be no partial dependency of
any attribute on only part of the key. But in this relation student name depends on student ID alone,
and similarly faculty ID and room depend on course ID alone, so the relation is not in second normal
form. At this level of normalization, each column in a table that is not a determiner of the contents of
another column must itself be a function of the other columns in the table. For example, in a table
with three columns containing customer ID, product sold, and price of the product when sold, the
price would be a function of the customer ID (entitled to a discount) and the specific product. If a
relation is not in 2NF then there are some anomalies, which are as under:

• Redundancy

• Insertion Anomaly

• Deletion Anomaly

• Update Anomaly

The general requirements of 2NF are:-

• Remove subsets of data that apply to multiple rows of a table and place them in separate tables.

• Create relationships between these new tables and their predecessors through the use of foreign
keys.
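
Whether a relation violates 2NF can be checked from its functional dependencies by looking for non-key attributes that depend on only part of the composite key. The Python below is a simplified sketch: the function name partial_dependencies and the attribute names (crId, stId, stName, fId, room, grade) are hypothetical, and the check inspects only the dependencies listed explicitly, not those that could be derived from them.

```python
def partial_dependencies(key, fds):
    """List FDs whose determinant is a proper subset of the key
    and which determine at least one non-key attribute."""
    key = set(key)
    found = []
    for lhs, rhs in fds:
        lhs = set(lhs)
        non_key = set(rhs) - key
        if lhs < key and non_key:   # "<" means proper subset for sets
            found.append((sorted(lhs), sorted(non_key)))
    return found


# Hypothetical CLASS relation with composite key (crId, stId).
# "grade" is an invented attribute that depends on the whole key.
class_key = ["crId", "stId"]
class_fds = [
    (["stId"], ["stName"]),
    (["crId"], ["fId", "room"]),
    (["crId", "stId"], ["grade"]),
]

violations = partial_dependencies(class_key, class_fds)
in_2nf = not violations
print(violations)
```

Each dependency reported here marks a group of columns that belongs in its own table. The dependency on the full key (crId, stId) is not reported, because it is exactly what 2NF requires.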

Consider the following table which has the anomalies:


The table is in 1NF because there are no repeating groups in any tuple and all cells contain atomic
values.

The first problem is redundancy. In this CLASS table, the course ID C3456 is repeated together with
faculty ID F2345, and similarly the room number 104 is repeated twice.

Second is the insertion anomaly. Suppose we want to insert a course that no student has registered
for yet. The student ID is part of the key, and there is no student ID to enter, so the course cannot be
inserted. This is called an insertion anomaly, and it leaves the database in a wrong state.

Next is the deletion anomaly. Suppose only one student is enrolled in a course, and for some reason
we want to delete the record of that student. The information about the course will be deleted as
well. We only wanted to delete the student record, but the course information is lost along with it, so
the database no longer reflects the actual system.

Last is the update anomaly. Suppose 50 students have registered for a course and we want to change
its classroom. We will have to change the records of all 50 students, and if any record is missed the
data becomes inconsistent.

The process for transforming a 1NF table to 2NF is:

• Identify any determinants other than the composite key, and the columns they determine.

• Create and name a new table for each determinant and the unique columns it determines.

• Move the determined columns from the original table to the new table. The determinant becomes
the primary key of the new table.

• Delete the columns you just moved from the original table except for the determinant which will
serve as a foreign key.

• The original table may be renamed to maintain semantic meaning.
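
The steps above can be sketched in Python as a projection that splits a determinant and its determined columns into a new table. This is an illustration under assumptions, not the notes' own procedure: the function names project and split_off, the column names, and all sample values other than C3456, F2345 and room 104 are hypothetical.

```python
def project(rows, columns):
    """Project rows onto columns, dropping duplicate tuples (order preserved)."""
    seen = set()
    out = []
    for row in rows:
        t = tuple(row[c] for c in columns)
        if t not in seen:
            seen.add(t)
            out.append(dict(zip(columns, t)))
    return out


def split_off(rows, determinant, determined):
    """Move the determined columns into a new table keyed by the determinant.
    The determinant stays behind in the original table as a foreign key."""
    new_table = project(rows, determinant + determined)
    remaining = [c for c in rows[0] if c not in determined]
    return project(rows, remaining), new_table


# Hypothetical CLASS rows with composite key (crId, stId)
class_rows = [
    {"crId": "C3456", "stId": "S1", "stName": "Ali", "fId": "F2345", "room": 104},
    {"crId": "C3456", "stId": "S2", "stName": "Sara", "fId": "F2345", "room": 104},
    {"crId": "C5678", "stId": "S1", "stName": "Ali", "fId": "F4567", "room": 106},
]

rest, course = split_off(class_rows, ["crId"], ["fId", "room"])
enrollment, student = split_off(rest, ["stId"], ["stName"])

print(course)      # one row per course
print(student)     # one row per student
print(enrollment)  # only the key columns remain
```

After the split, the room of a course is stored in exactly one row of the course table, so changing it is a single update regardless of how many students are enrolled.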


Third Normal Form
A relational table is in third normal form (3NF) if it is already in 2NF and every non-key column is
non-transitively dependent upon its primary key. In other words, all non-key attributes are
functionally dependent only upon the primary key.

Transitive Dependency
A transitive dependency is one that is carried through another attribute: if A → B and B → C, then C
depends on A transitively through B. A transitive dependency occurs when one non-key attribute
determines another non-key attribute. For third normal form we concentrate on relations with one
candidate key, and we eliminate transitive dependencies. Transitive dependencies cause insertion,
deletion, and update anomalies. We will now see this with an example:

Here the table is in second normal form, as there is no partial dependency of any attribute. The key is
student ID. The problem is a transitive dependency, in which a non-key attribute is determined by
another non-key attribute: the program credits are determined by the program name, so the table is
not in 3NF. Transitive dependencies cause the same four anomalies. For example:
All four anomalies exist in this table. We will have to remove them by decomposing the table so
that the transitive dependency is eliminated, as follows:

The process of transforming a table into 3NF is:

• Identify any determinants, other than the primary key, and the columns they determine.

• Create and name a new table for each determinant and the unique columns it determines.

• Move the determined columns from the original table to the new table. The determinant becomes
the primary key of the new table.

• Delete the columns you just moved from the original table except for the determinant which will
serve as a foreign key.

• The original table may be renamed to maintain semantic meaning.

STD (stId, stName, stAdr, prName)

PROGRAM (prName, prCrdts)

We have now decomposed the relation into two relations, student and program. Both relations are
in third normal form and are free of all the anomalies.
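
The decomposed design can be tried out with Python's built-in sqlite3 module. The sketch below is illustrative only: the column types, the sample students, the credit values and the extra MBA program are assumptions and not part of the original notes. It creates the two relations, links them with a foreign key on prName, and joins them to recover the original five-column view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE PROGRAM (
    prName  TEXT PRIMARY KEY,
    prCrdts INTEGER NOT NULL
);
CREATE TABLE STD (
    stId   TEXT PRIMARY KEY,
    stName TEXT,
    stAdr  TEXT,
    prName TEXT REFERENCES PROGRAM(prName)
);
""")

# Hypothetical sample data
conn.executemany("INSERT INTO PROGRAM VALUES (?, ?)", [("BCS", 130), ("MCS", 64)])
conn.executemany("INSERT INTO STD VALUES (?, ?, ?, ?)", [
    ("S1", "Ali", "Lahore", "BCS"),
    ("S2", "Sara", "Karachi", "BCS"),
    ("S3", "Ali", "Multan", "MCS"),
])

# A program can be stored before any student enrols in it (no insertion anomaly).
conn.execute("INSERT INTO PROGRAM VALUES (?, ?)", ("MBA", 72))

# Changing the credits of a program touches one row (no update anomaly).
cur = conn.execute("UPDATE PROGRAM SET prCrdts = ? WHERE prName = ?", (136, "BCS"))
updated = cur.rowcount

# Joining on the foreign key rebuilds the original five-column relation.
rows = conn.execute("""
    SELECT s.stId, s.stName, s.stAdr, s.prName, p.prCrdts
    FROM STD s JOIN PROGRAM p ON p.prName = s.prName
    ORDER BY s.stId
""").fetchall()
for row in rows:
    print(row)
```

Because the credits are stored once per program, every student of that program sees the updated value through the join, and deleting a student never removes a program.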
